NumPy Example
Description
This example demonstrates how to import external files using the TAR functionality and how to use built-in libraries like NumPy. In this example, the user specifies an input file (defined as an input parameter) that should already be packaged in the fs.tar file. The script will import the NumPy library, open the file, perform some basic operations on the data, and then display an output value.
Click here to see the full list of compatible libraries
Executing NumPy Example
We use Truebit to execute the NumPy task to get a verified result.
The Code
You can find the NumPy example in the folder truebit-nextgen-examples/function-tasks/js/numpy-pandas.
📄 task.py 📄 create-fs.sh 📄 gendata.py
task.py
import numpy as np
import os


def main():
    # Load filepath of input CSV
    try:
        with open("input.txt", "r") as file:
            input_file = file.read().strip()
    except FileNotFoundError:
        print("Error: The file 'input.txt' does not exist.")
        return
    except Exception as e:
        print(f"An error occurred while reading 'input.txt': {e}")
        return

    # Read the input file (assuming it contains CSV data)
    data = []
    try:
        with open(input_file, 'r') as file:
            # Skip the header row; the remaining lines hold the values
            header = file.readline().strip().split(',')
            for line in file:
                values = line.strip().split(',')
                data.append(values)
    except FileNotFoundError:
        print(f"Error: The file '{input_file}' does not exist.")
        return
    except Exception as e:
        print(f"An error occurred while reading '{input_file}': {e}")
        return

    # Convert columns to appropriate types
    A = np.array([float(row[0]) for row in data])
    B = np.array([float(row[1]) for row in data])
    C = np.array([row[2] for row in data])

    # Display the original data
    original_output = "Original Data:\n" + str(data)

    # Calculate basic statistics using numpy
    mean_A = np.mean(A)
    std_A = np.std(A)
    sum_B = np.sum(B)
    max_B = np.max(B)
    statistics = (
        f"\nMean of column 'A': {mean_A}\n"
        f"Standard deviation of column 'A': {std_A}\n"
        f"Sum of column 'B': {sum_B}\n"
        f"Maximum value of column 'B': {max_B}"
    )

    # Add a new column with the natural logarithm of 'B' values using numpy
    log_B = np.log(B + 1)  # +1 to avoid log(0)

    # Create a correlation matrix using numpy
    correlation_matrix = np.corrcoef(A, B)
    correlation_output = "Correlation matrix between 'A' and 'B':\n" + str(correlation_matrix)

    # Normalize column 'A' using numpy
    A_normalized = (A - mean_A) / std_A

    # Filter rows where column 'A' is greater than its mean
    filtered_data = np.array([row for row in data if float(row[0]) > mean_A])

    # Add a date column (as a string for simplicity)
    dates = np.arange('2023-01-01', '2024-01-01', dtype='datetime64[D]').astype(str)

    # Group by column 'C' (assuming C is categorical with values 'X', 'Y', 'Z')
    unique_C, indices_C = np.unique(C, return_inverse=True)
    group_means_A = [np.mean(A[indices_C == i]) for i in range(len(unique_C))]
    grouped_output = "Mean of 'A' grouped by 'C':\n" + str(dict(zip(unique_C, group_means_A)))

    # Label each value of 'A' as 'High' or 'Low' to create a new column 'A_category'
    A_category = np.array(['High' if x > 50 else 'Low' for x in A])

    # Handling missing values: introduce some NaN values in 'B' and then fill them
    B_with_nan = B.copy()
    B_with_nan[::3] = np.nan
    B_filled = np.where(np.isnan(B_with_nan), np.nanmean(B_with_nan), B_with_nan)

    # Prepare the output to be written to output.txt
    output_content = (
        original_output + statistics + "\n\n" +
        "Log of B:\n" + str(log_B) + "\n\n" +
        "Normalized A:\n" + str(A_normalized) + "\n\n" +
        "Filtered Data (A > Mean of A):\n" + str(filtered_data) + "\n\n" +
        correlation_output + "\n\n" +
        grouped_output + "\n\n" +
        "A Category:\n" + str(A_category) + "\n\n" +
        "B with NaNs filled:\n" + str(B_filled)
    )

    # Save the results to output.txt
    with open('output.txt', 'w') as file:
        file.write(output_content)
    print("Results written to 'output.txt'.")


if __name__ == "__main__":
    main()

In/Out Parameters
To pass a parameter to the Truebit task, read its value from the "input.txt" file. For now, only one input parameter is allowed.
To return the output value from the Truebit task, save it in the "output.txt" file.
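The input/output convention described above can be reduced to a minimal sketch: read the single parameter from input.txt, compute something, and write the result to output.txt. The helper name run_task, its workdir argument, and the upper-casing step are hypothetical placeholders for illustration; only the two file names come from this page.

```python
from pathlib import Path


def run_task(workdir: str = ".") -> str:
    """Read the single input parameter, process it, and write the result.

    Hypothetical helper: only the file names 'input.txt' and 'output.txt'
    are taken from the convention described in this section.
    """
    base = Path(workdir)
    # The only input parameter is the stripped content of input.txt.
    param = (base / "input.txt").read_text().strip()
    # Placeholder computation; a real task would put its logic here.
    result = param.upper()
    # Whatever is written to output.txt is the task's output value.
    (base / "output.txt").write_text(result)
    return result
```

In task.py the parameter is a file name, so the "computation" step there opens that file instead of transforming the string directly.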
create-fs.sh
This example uses external files to import data for processing. The create-fs.sh file is a shell script that runs the tar command to package those files into a single archive. The resulting fs.tar file is used as input data in the Function Task.
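The contents of create-fs.sh and gendata.py are not reproduced on this page, so the following is only a rough Python equivalent built on the standard csv and tarfile modules. It writes a small CSV in the layout task.py expects (a header row, two numeric columns, and one categorical column) and packs it into an uncompressed tar archive. The file name data.csv, both helper names, and the generated values are assumptions made for illustration.

```python
import csv
import random
import tarfile
from pathlib import Path
from typing import Iterable


def write_sample_csv(path: Path, rows: int = 10, seed: int = 0) -> None:
    """Write a CSV with header A,B,C: two numeric columns and one category."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    with path.open("w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(["A", "B", "C"])
        for _ in range(rows):
            writer.writerow([
                round(rng.uniform(0, 100), 2),  # column A: float
                rng.randint(0, 1000),           # column B: non-negative number
                rng.choice(["X", "Y", "Z"]),    # column C: category label
            ])


def create_fs_tar(files: Iterable[Path], archive: Path = Path("fs.tar")) -> None:
    """Pack the given files into an uncompressed tar archive."""
    with tarfile.open(archive, "w") as tar:
        for file in files:
            # Store each file under its bare name. Assumption: the archive is
            # unpacked where the task runs, so the name given in input.txt
            # resolves as a relative path.
            tar.add(file, arcname=file.name)
```

Calling write_sample_csv(Path("data.csv")) and then create_fs_tar([Path("data.csv")]) would produce an fs.tar holding a single data.csv entry, which corresponds to passing data.csv as the input parameter.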
Trying Out
We will use the Truebit CLI to compile and test our source code. Once finalized, we will deploy it to the Coordination Hub so that any user with the namespace and taskname can access or execute it.
Step 1: Create The NumPy Example Source Code
Within the numpy folder you will find a file called task.py.
Step 2: Build The Source Code
Execute the build command against the Truebit node to get an instrumented Truebit task.
Output
The taskId always starts with the language prefix, followed by "_" and the cID.
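Because the taskId is the language prefix, an underscore, and the cID, it can be split on the first underscore. The sketch below assumes the prefix itself never contains an underscore; the helper name split_task_id and the sample value py_abc123 are made up for illustration and are not real Truebit output.

```python
from typing import Tuple


def split_task_id(task_id: str) -> Tuple[str, str]:
    """Split a taskId of the form '<language prefix>_<cID>' into its parts."""
    # partition() splits on the first underscore only, so any underscore
    # inside the cID is preserved.
    prefix, sep, cid = task_id.partition("_")
    if not sep or not prefix or not cid:
        raise ValueError(f"Unexpected taskId format: {task_id!r}")
    return prefix, cid


# Hypothetical taskId, used only to show the shape of the result.
prefix, cid = split_task_id("py_abc123")
```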
Step 3: Try The Code
Execute the start command against the Truebit node to test our algorithm. You will need to submit the instrumented taskId and the input parameter.
[taskId]: Add the taskId generated in the previous step.
Output
Step 4: Deploy The Task
Finally, execute the deploy command to deploy the task to the Coordination Hub, so that anyone with the namespace, taskname, and API key can execute it.
[taskId]: Add the taskId generated in Step 2.
Output