TrainingData#

A TrainingData instance is a collection of TrainingDataPart instances representing a prediction that can be used as input for the training of models.

Directory#

class TrainingDataDirectory#

Provides a collection of methods related to training data.

This class is accessed through client.training_data.

Example

List all of the training data:

import ansys.simai.core

simai = ansys.simai.core.from_config()
simai.training_data.list()
create(name: str, project: Project | str | None = None) TrainingData#

Create a TrainingData object.

Parameters:
Returns:

Created TrainingData object.

Return type:

TrainingData

delete(training_data: TrainingData | str) None#

Delete a TrainingData object and its associated parts from the server.

Parameters:

training_data (TrainingData | str) – ID or model object of the TrainingData object.

get(id) TrainingData#

Get a specific TrainingData object from the server.

list(filters: dict[str, Any] | list[tuple[str, Literal['EQ', 'LIKE', 'IN', 'GT', 'GTE', 'LT', 'LTE'], Any]] | list[RawSingleFilter] | None = None) List[TrainingData]#

List all TrainingData objects on the server.

Parameters:

filters (dict[str, Any] | list[tuple[str, Literal['EQ', 'LIKE', 'IN', 'GT', 'GTE', 'LT', 'LTE'], Any]] | list[RawSingleFilter] | None) – Optional filters to apply.

Returns:

List of all TrainingData objects on the server.

Return type:

List[TrainingData]
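The filter shapes accepted by list() can be sketched as follows. This is a minimal illustration, not the library's reference usage: the helper names are hypothetical, `directory` stands for `simai.training_data`, and it is an assumption that `name` is a filterable field and that the dict form matches by equality.

```python
from typing import Any


def list_by_exact_name(directory: Any, name: str) -> list:
    """Dict form: field name mapped to the value to match (assumed equality)."""
    return directory.list(filters={"name": name})


def list_by_name_pattern(directory: Any, pattern: str) -> list:
    """Tuple form: (field, operator, value), using an operator from the signature."""
    # The pattern syntax understood by "LIKE" is not described on this page.
    return directory.list(filters=[("name", "LIKE", pattern)])
```

With a configured client, a call would look like `list_by_exact_name(simai.training_data, "my-case")`.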

upload_folder(training_data: TrainingData | str, folder_path: Path | str | PathLike) List[TrainingDataPart]#

Upload all files in a folder to a TrainingData object.

This method automatically requests computation of the training data once the upload is complete unless specified otherwise.

Parameters:
  • training_data (TrainingData | str) – ID or model object of the training data to upload parts to.

  • folder_path (Path | str | PathLike) – Path to the folder that contains the files to upload.
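Creating a TrainingData object and filling it from a folder might be combined as in the sketch below. The helper name `create_from_folder` is hypothetical, and `directory` is assumed to be `simai.training_data`; only create() and upload_folder() as documented above are called.

```python
from pathlib import Path
from typing import Any, Optional, Tuple, Union


def create_from_folder(
    directory: Any,
    name: str,
    folder_path: Union[str, Path],
    project: Optional[Any] = None,
) -> Tuple[Any, list]:
    """Create a training data object, then upload every file in a folder to it."""
    training_data = directory.create(name, project)
    # upload_folder() returns the list of uploaded TrainingDataPart objects.
    parts = directory.upload_folder(training_data, Path(folder_path))
    return training_data, parts
```

A call such as `create_from_folder(simai.training_data, "my-case", "path/to/folder")` returns both the new object and its uploaded parts.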

upload_part(training_data: TrainingData | str, file: Path | str | PathLike | Tuple[BinaryIO | RawIOBase | BufferedIOBase | Path | str | PathLike, str], monitor_callback: Callable[[int], None] | None) TrainingDataPart#

Add a part to a TrainingData object.

Parameters:
Returns:

Added TrainingDataPart object.

Return type:

TrainingDataPart
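The signature shows that `file` may be a (file-like object, name) tuple and that `monitor_callback` receives an integer. A minimal sketch of both follows. The helper names are hypothetical, and what the integer represents (for example, a byte count) is an assumption, so the callback below only records the values it is given.

```python
import io
from typing import Any, Callable, List, Tuple


def make_progress_recorder() -> Tuple[Callable[[int], None], List[int]]:
    """Return a callback plus the list it appends each reported value to."""
    reported: List[int] = []

    def on_progress(value: int) -> None:
        reported.append(value)

    return on_progress, reported


def upload_bytes_as_part(
    directory: Any, training_data: Any, payload: bytes, filename: str
) -> Tuple[Any, List[int]]:
    """Upload in-memory bytes as a part, using the (file object, name) tuple form."""
    callback, reported = make_progress_recorder()
    part = directory.upload_part(
        training_data,
        (io.BytesIO(payload), filename),
        monitor_callback=callback,
    )
    return part, reported
```

Passing the callback explicitly matters here because the directory-level signature declares `monitor_callback` without a default value.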

Model#

class TrainingData#

Provides the local representation of a training data object.

add_to_project(project: Project | str)#

Add the training data to a Project object.

Parameters:

project (Project | str) – ID or model object of the project to add the data to.

assign_subset(project: Project | str, subset: SubsetEnum | str | None) None#

Assign the training data to a subset in relation to a given project.

Parameters:
  • project (Project | str) – ID or model object of the project.

  • subset (SubsetEnum | str | None) –

    SubsetEnum attribute (for example, SubsetEnum.TRAINING), string value (for example, “Training”), or None to unassign. Available options: Training, Test.

    Each training data newly added to a project has its subset set to None by default.

    Setting the subset to None resets the assignment. Training data without an assigned subset is automatically allocated to either the test or the training subset on each model building request.

    When assigning a specific subset to your training data, note that:

    • Each subset requires at least one data point.
    • The training subset is used to train the model.
    • The test subset is used for the model evaluation report but is not learned by the model.
    • As a rule of thumb, about 10% of your data should be allocated to the test subset.

Returns:

None

Return type:

None
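The subset guidance above (at least one data point per subset, roughly 10% in the test subset) can be sketched as a small helper. The function name and the choice to put the first items in the test subset are illustrative assumptions; only assign_subset() with the documented string values is called.

```python
from typing import Any, Sequence


def assign_subsets(
    training_data_items: Sequence[Any], project: Any, test_fraction: float = 0.1
) -> int:
    """Assign the first items to "Test" and the rest to "Training".

    Returns the number of items placed in the test subset.
    """
    total = len(training_data_items)
    if total < 2:
        raise ValueError("At least two items are needed so each subset gets one.")
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1.")
    # Keep at least one item in each subset.
    n_test = min(max(1, round(total * test_fraction)), total - 1)
    for index, item in enumerate(training_data_items):
        subset = "Test" if index < n_test else "Training"
        item.assign_subset(project, subset)
    return n_test
```

In practice the input order would usually be shuffled first so that the test subset is not biased by upload order.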

delete() None#

Delete the training data on the server.

extract_data() None#

Extract or reextract the data from a training data.

Data should be extracted from a training data once all its parts have been fully uploaded. This is done automatically when using TrainingDataDirectory.upload_folder() to create training data.

Data can only be reextracted from a training data if the extraction previously failed or if new files have been added.
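When parts are uploaded one by one rather than with upload_folder(), extraction has to be requested explicitly once every upload has finished. A minimal sketch of that ordering, with a hypothetical helper name:

```python
from typing import Any, Iterable, List


def upload_parts_then_extract(training_data: Any, files: Iterable[Any]) -> List[Any]:
    """Upload each file as a part, then request data extraction once."""
    parts = [training_data.upload_part(file) for file in files]
    # Extraction is requested only after all parts are fully uploaded.
    training_data.extract_data()
    return parts
```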

get_subset(project: Project | str) SubsetEnum | None#

Get the subset that the training data belongs to, in relation to the given project.

Parameters:

project (Project | str) – ID or model object of the project to check the subset for.

Returns:

The SubsetEnum of the subset to which the TrainingData belongs, if any (for example, <SubsetEnum.TEST: ‘Test’>); None otherwise.

Return type:

SubsetEnum | None
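To check how a set of training data is distributed across subsets for a project, the values returned by get_subset() can be tallied. The helper below is a hypothetical sketch; unassigned items are counted under the key None.

```python
from collections import Counter
from typing import Any, Iterable


def count_by_subset(training_data_items: Iterable[Any], project: Any) -> Counter:
    """Count items per subset value returned by get_subset() for one project."""
    return Counter(item.get_subset(project) for item in training_data_items)
```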

reload() None#

Refresh the object with its representation from the server.

remove_from_project(project: Project | str)#

Remove the training data from a Project object.

Parameters:

project (Project | str) – ID or model of the project to remove data from.

Raises:
upload_folder(folder_path: Path | str | PathLike) List[TrainingDataPart]#

Upload all the parts contained in a folder to a TrainingData instance.

Upon upload completion, SimAI will extract data from each part.

Parameters:

folder_path (Path | str | PathLike) – Path to the folder with the files to upload.

Returns:

List of uploaded training data parts.

Return type:

List[TrainingDataPart]

upload_part(file: Path | str | PathLike | Tuple[BinaryIO | RawIOBase | BufferedIOBase | Path | str | PathLike, str], monitor_callback: Callable[[int], None] | None = None) TrainingDataPart#

Add a part to the training data.

Parameters:
Returns:

Created TrainingDataPart.

Return type:

TrainingDataPart

wait(timeout: float | None = None) bool#

Wait for all jobs concerning the object to either finish or fail.

Parameters:

timeout (float | None) – Maximum amount of time in seconds to wait. The default is None, which means that there is no limit on the time to wait.

Returns:

True if the computation has finished, False if the operation timed out.

Return type:

bool
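The return value of wait() only distinguishes completion from a timeout, so the status properties documented on this class are needed to tell success from failure. A minimal sketch, with a hypothetical helper name and illustrative status strings:

```python
from typing import Any, Optional


def wait_and_describe(training_data: Any, timeout: Optional[float] = None) -> str:
    """Wait for the object's jobs, then summarize the outcome as a short string."""
    if not training_data.wait(timeout=timeout):
        return "timed out"
    if training_data.has_failed:
        return f"failed: {training_data.failure_reason}"
    if training_data.is_ready:
        return "ready"
    return "pending"
```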

property extracted_metadata: Dict | None#

Metadata extracted from the training data.

property failure_reason#

Optional message explaining why the creation of the object failed.

property fields: dict#

Dictionary containing the raw object representation.

property has_failed#

Boolean indicating if the creation of the object failed.

property id: str#

ID of the object on the server.

property is_pending#

Boolean indicating if the object is still in creation. The value becomes False once object creation is either successful or has failed.

property is_ready#

Boolean indicating if the object has finished creating without error.

property name: str#

Name of the training data.

property parts: List[TrainingDataPart]#

List of all part objects in the training data.