AryaXAI SDK Documentation | Build & Integrate Explainable AI and Alignment Tools

Upon accessing the project, your initial task is to upload pertinent data sets. These may encompass data utilized for training, testing, validation, production, or any other data integral to your project's scope and requirements.

Data upload in Tabular projects

To upload data to the project, pass either a file path or a Pandas DataFrame, along with a tag that labels the data (for example, 'Training').

If you are uploading data for the first time, you must configure the project details in the 'Project config'. This config is used for all further operations and cannot be changed once set.

You can achieve this and upload data through our SDK by utilizing the following commands:


config = {
    "project_type": "classification",  # the prediction type of your project (classification / regression)
    "unique_identifier": "id",  # unique identifier for your project
    "true_label": "loan_status",  # target label
    "pred_label": "",  # define the predicted value if you want to use surrogate models for explainability
    "feature_exclude": [],  # features you are not using in your model or in the XAI model
    "drop_duplicate_uid": True
}

Tag = 'Training'  # data is differentiated using the tag
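Because the project config cannot be changed once set, it may be worth checking the dictionary locally before the first upload. The sketch below is a hypothetical pre-flight helper, not part of the AryaXAI SDK; the choice of required keys is an assumption based on the config shown above.

```python
# Hypothetical pre-flight check; NOT part of the AryaXAI SDK.
ALLOWED_PROJECT_TYPES = {"classification", "regression"}
# Assumed minimum set of keys, taken from the example config above.
REQUIRED_KEYS = ("project_type", "unique_identifier", "true_label")


def validate_project_config(config):
    """Return a list of problems found in config; an empty list means none."""
    problems = []
    for key in REQUIRED_KEYS:
        if not config.get(key):
            problems.append(f"missing or empty key: {key}")
    project_type = config.get("project_type")
    if project_type and project_type not in ALLOWED_PROJECT_TYPES:
        problems.append(f"unsupported project_type: {project_type}")
    if not isinstance(config.get("feature_exclude", []), list):
        problems.append("feature_exclude must be a list")
    return problems
```

Calling `validate_project_config(config)` and uploading only when the returned list is empty keeps obvious typos out of a config that cannot be edited later.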

The following command uploads the data into the project. This also builds the initial ML model.

NOTE: Before uploading a model, make sure the features that go into the model have already been uploaded through a data upload.


project.upload_data('file_path', 'tag', config)

# Help on method upload_data
help(project.upload_data)

To see the help for uploading a data description:


help(project.upload_data_description)

Data can be uploaded to the project either directly from a file or by passing a Pandas DataFrame.
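When uploading from a CSV file, a quick local check can confirm that the columns named in the config actually exist in the file. This is a minimal sketch using only the standard library; `missing_config_columns` is a hypothetical helper and not an SDK function, and it assumes the file is a comma-separated CSV with a header row.

```python
import csv


def missing_config_columns(csv_path, config):
    """Return column names referenced by config that are absent from the CSV header."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), [])
    wanted = [
        config.get("unique_identifier"),
        config.get("true_label"),
        config.get("pred_label"),
    ]
    wanted += list(config.get("feature_exclude", []))
    # Skip empty entries such as pred_label == "".
    return [name for name in wanted if name and name not in header]
```

If the returned list is empty, the file can then be passed to `project.upload_data` with the tag and config as described above.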

Data upload in Image projects

Data upload

To upload data using the SDK, follow these steps:

1. Upload the Model File

Before uploading any data, ensure that the model file (typically .h5) is uploaded.

2. Upload the Data File

You can upload data in .zip format. The data upload can accept:

A file path to the zipped dataset
A Pandas DataFrame (if working within a Python environment)

3. Initialize Project Configuration (First-Time Setup Only)

If you're uploading data for the first time, it's required to set the Project Configuration. This configuration is immutable and will be used for all subsequent operations within the project.

You can configure and upload your model and data using the following SDK command:


project.upload_data(
    data="/content/content/data.zip",
    tag="training",
    model="/content/content/model1.h5",
    model_architecture="deep_learning",
    model_type="tensorflow",
    model_name="cifar1"
)
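Before starting an upload of this kind, the local inputs can be sanity-checked. The sketch below is a hypothetical helper (not part of the SDK) that only verifies the data path is a readable, non-empty zip archive and that the model file exists; it does not inspect the model itself.

```python
import os
import zipfile


def check_image_upload_inputs(data_zip, model_file):
    """Return a list of problems with the local files; empty means none found."""
    problems = []
    if not zipfile.is_zipfile(data_zip):
        problems.append(f"not a readable zip archive: {data_zip}")
    else:
        with zipfile.ZipFile(data_zip) as archive:
            has_files = any(not info.is_dir() for info in archive.infolist())
        if not has_files:
            problems.append(f"zip archive contains no files: {data_zip}")
    if not os.path.isfile(model_file):
        problems.append(f"model file not found: {model_file}")
    return problems
```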

4. View Project Configuration

Once the upload is complete, you can verify the configuration:


project.config()

NOTE: The config() function may throw an error if explainability generation is still in progress. Please allow some time for it to complete before checking the configuration.
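One way to handle this wait is to retry `project.config()` at intervals. The following is a minimal sketch: `wait_for_config` is a hypothetical helper, the retry count and delay are arbitrary, and because the exception type raised by the SDK is not documented here, it catches `Exception` broadly.

```python
import time


def wait_for_config(project, attempts=10, delay_seconds=30.0, sleep=time.sleep):
    """Call project.config() until it succeeds or the attempts run out."""
    last_error = None
    for attempt in range(attempts):
        try:
            return project.config()
        except Exception as error:  # assumption: the SDK's exception type is unknown
            last_error = error
            if attempt < attempts - 1:
                sleep(delay_seconds)
    raise TimeoutError(
        f"project.config() still failing after {attempts} attempts"
    ) from last_error
```

The `sleep` parameter exists only so the delay can be replaced in tests; in normal use, `wait_for_config(project)` is enough.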

Uploading Additional Data (e.g., Testing or Validation)

After the initial configuration is set, you can upload additional datasets (like testing or validation) without reconfiguring the project:


# Help on method upload_data
help(project.upload_data)

Additional functions:

Fetch the list of uploaded files and data


project.files()

Delete an uploaded file


project.delete_file('file_name')

To fetch all tags uploaded by the user:


project.tags()

Data upload through data connectors:

You can upload data using various data connectors, including S3, GCS, Google Drive, SFTP, and Dropbox. The availability of these connectors will depend on your subscription plan.

To create data connectors at the organizational level:


config = {
    "access_key": "",
    "secret_key": ""
}

s3_connector = organization.create_data_connectors(data_connector_name="", data_connector_type="s3", s3_config=config)
s3_connector
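Rather than hard-coding credentials in a notebook or script, the S3 config can be built from environment variables. This is a sketch only: `s3_config_from_env` is a hypothetical helper, and the variable names `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` follow the usual AWS convention rather than anything the SDK requires.

```python
import os


def s3_config_from_env(env=None):
    """Build the S3 connector config from environment variables."""
    env = os.environ if env is None else env
    names = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        "access_key": env["AWS_ACCESS_KEY_ID"],
        "secret_key": env["AWS_SECRET_ACCESS_KEY"],
    }
```

The returned dictionary has the same shape as the `config` passed as `s3_config` above.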

To list buckets at the organizational level:


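# Assumes the organization object mirrors project.list_data_connectors_buckets
buckets = organization.list_data_connectors_buckets(data_connector_name="")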
buckets

To list file paths at the organizational level:


filepaths = organization.list_data_connectors_filepath(data_connector_name="gcs1", bucket_name="aryaxai-testing")
filepaths

To list data connectors at the organizational level:


link_services = organization.list_data_connectors()
link_services

For the initial upload of the model and data via data connectors, use the following SDK functionality:


project.upload_data_dataconnectors(
    file_path="Image Classification/cifar/500 data/cifar10trainmini.zip",
    tag="training",
    model_path="Image Classification/cifar/cifar10epoch100.h5",
    model_architecture="deep_learning",
    model_type="tensorflow",
    model_name="cifar1",
    data_connector_name="s31",
    bucket_name="aryaxai-testing",
)

S3

Create data connector


config = {
    "access_key": "",
    "secret_key": ""
}

s3_connector = project.create_data_connectors(data_connector_name="", data_connector_type="s3", s3_config=config)
s3_connector

GCS

Create data connector


config = {
    "gcp_project_name": "",
    "type": "service_account",
    "project_id": "",
    "private_key_id": "",
    "private_key": "",
    "client_email": "",
    "client_id": "",
    "auth_uri": "",
    "token_uri": ""
}

gcs_connector = project.create_data_connectors(data_connector_name="", data_connector_type="gcs", gcs_config=config)
gcs_connector
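Most fields in this config share their names with the fields of a Google service-account JSON key file, so the dictionary can be filled from that file instead of being typed by hand. The sketch below assumes that correspondence holds for your key file; `gcs_config_from_key_file` is a hypothetical helper, not an SDK function.

```python
import json

# Fields copied from the service-account key file (same names as the config above).
GCS_KEY_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "client_id",
    "auth_uri",
    "token_uri",
)


def gcs_config_from_key_file(key_path, gcp_project_name):
    """Build the GCS connector config from a service-account JSON key file."""
    with open(key_path) as handle:
        key_data = json.load(handle)
    missing = [field for field in GCS_KEY_FIELDS if field not in key_data]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    config = {"gcp_project_name": gcp_project_name}
    config.update({field: key_data[field] for field in GCS_KEY_FIELDS})
    return config
```

Any extra fields in the key file are ignored, so the result contains only the keys listed in the config above.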

Google Drive

Create data connector


config = {
    "type": "service_account",
    "project_id": "",
    "private_key_id": "",
    "private_key": "",
    "client_email": "",
    "client_id": "",
    "auth_uri": "",
    "token_uri": ""
}

gdrive_connector = project.create_data_connectors(data_connector_name="gdrive_2", data_connector_type="gdrive", gdrive_config=config)
print(gdrive_connector)

SFTP

Create data connector


config = {
    "hostname": "",
    "port": 22,  # placeholder: 22 is the standard SSH/SFTP port; replace with your server's port
    "username": "",
    "password": ""
}

sftp_connector = project.create_data_connectors(data_connector_name="", data_connector_type="sftp", sftp_config=config)
print(sftp_connector)

Dropbox

Create data connector


dbx_connector = project.create_data_connectors(data_connector_name="", data_connector_type="dropbox")

To view all linked data connectors for the current project:


link_services = project.list_data_connectors()
link_services

Once the data connectors are created, you can test the connection using the function below:

Test connection


project.test_data_connectors(data_connector_name="")

List buckets


buckets = project.list_data_connectors_buckets(data_connector_name="")
buckets

List File paths


filepaths = project.list_data_connectors_filepath(data_connector_name="", bucket_name="")
filepaths["file_paths"]
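A bucket listing can be long, so it may help to narrow it down to the files you intend to upload. This sketch assumes the listing is a dictionary whose "file_paths" entry is a list of path strings, as the indexing above suggests; `filter_connector_paths` is a hypothetical helper.

```python
def filter_connector_paths(listing, suffix):
    """Return the paths in listing["file_paths"] that end with suffix (case-insensitive)."""
    paths = listing.get("file_paths", [])
    wanted = suffix.lower()
    # Non-string entries are skipped because the element type is an assumption.
    return [path for path in paths if isinstance(path, str) and path.lower().endswith(wanted)]
```

For example, `filter_connector_paths(filepaths, ".zip")` would keep only the zipped data sets.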

Upload Data


project.upload_data_dataconnectors(data_connector_name="", tag="training", bucket_name="", file_path='', config=config)

Upload Feature Mapping


project.upload_feature_mapping_dataconnectors(data_connector_name="", bucket_name="", file_path="")

Upload Data description


project.upload_data_description_dataconnectors(data_connector_name="", bucket_name="", file_path="")

Additional functions:

To see uploaded model info:


project.models()

Once the data is uploaded, you can also view the files and file info through the SDK.


# Check the files that are uploaded to the project.
project.files()

Some additional functions:


# Fetch detailed analytics for each uploaded file individually
project.file_summary()

NOTE: The config() function may throw an error until explainability has been generated, so wait until config() succeeds.

Once the upload is complete, you can view your Project Config. Check the excluded and included features and confirm that they match your settings.

Additionally, the AryaXAI AutoML framework may remove additional columns if more than 30% of their values are missing. You can override this in the AutoML model settings and retrain the model.
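To anticipate which columns the 30% rule might affect, the missing-value share of each column can be estimated locally before uploading. This is a rough sketch using the standard library; how AryaXAI itself counts a value as missing is not stated here, so the `missing_tokens` set is an assumption, as is the helper name.

```python
import csv


def columns_over_missing_threshold(csv_path, threshold=0.30,
                                   missing_tokens=("", "NA", "NaN", "null")):
    """Return {column: missing_fraction} for columns whose fraction exceeds threshold."""
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        columns = reader.fieldnames or []
        missing = dict.fromkeys(columns, 0)
        total = 0
        for row in reader:
            total += 1
            for column in columns:
                value = row.get(column)
                # Short rows yield None; otherwise compare the trimmed text.
                if value is None or value.strip() in missing_tokens:
                    missing[column] += 1
    if total == 0:
        return {}
    return {
        column: missing[column] / total
        for column in columns
        if missing[column] / total > threshold
    }
```

Columns reported here are candidates to review in the Project Config after upload, or to handle through the AutoML model settings.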


# To see all the settings: data, data encoding, and model params
project.config()

Once the initial data configuration is complete, you can upload additional data sets, such as testing or validation data, or data with your own custom tag, without reconfiguring the settings.

Additionally, you can also delete the uploaded file:


project.delete_file('file_name')

To fetch all tags that the user has uploaded:


project.tags()

# View the data for a tag:
project.tag_data('XGBoost_default_testdata')

NOTE: Tags containing "_test" are generated automatically by AryaXAI after running the respective model on the 'Test' data.
