models¶
Model¶
- class msdss_models_api.models.Model(file_path='./model', file_ext='pickle', can_input=True, can_output=True, can_update=True, settings={})[source]¶
Template class to standardize modelling.
Methods delete, load, and save are handled by default using pickle and do not need to be defined if there is no need for custom model saving and loading.
Methods input, output, and update need to be defined, as they are placeholders for standardized functions of the model.
- Parameters
file_path (str) – Path to save, load, and delete the model for persistence, without the extension. Can be used in methods as self.file.
file_ext (str) – File extension to save the model in.
can_input (bool) – Whether the method .input is defined and available. This is useful for controlling route requests in an API.
can_output (bool) – Whether the method .output is defined and available. This is useful for controlling route requests in an API.
can_update (bool) – Whether the method .update is defined and available. This is useful for controlling route requests in an API.
settings (dict) – Dict of initial custom settings to be used by model methods. These are expected not to change from the time of initialization.
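The can_input, can_output, and can_update flags describe which template methods a model supports. As a minimal sketch of how an API layer might consult such flags before dispatching a request, consider the following. The flags dict, the handlers dict, and the helpers allowed_actions and dispatch are hypothetical names used for illustration only; they are not part of msdss_models_api.

```python
def allowed_actions(flags):
    """Return the actions whose can_<action> flag is truthy."""
    actions = ('input', 'output', 'update')
    return [action for action in actions if flags.get(f'can_{action}', False)]


def dispatch(handlers, flags, action, data):
    """Call the handler for an action only if its flag allows it."""
    if action not in allowed_actions(flags):
        raise PermissionError(f'{action} is not enabled for this model')
    return handlers[action](data)


# Hypothetical flags, mirroring the constructor parameters of the same names
flags = {'can_input': True, 'can_output': True, 'can_update': False}

# Hypothetical handlers standing in for a model's methods
handlers = {
    'input': lambda data: len(data),
    'output': lambda data: [value * 2 for value in data],
    'update': lambda data: None,
}

enabled = allowed_actions(flags)
doubled = dispatch(handlers, flags, 'output', [1, 2, 3])

# A disabled action is rejected before the handler is ever called
try:
    dispatch(handlers, flags, 'update', [4])
    update_blocked = False
except PermissionError:
    update_blocked = True
```

Keeping the check in one place means a route can be rejected without inspecting the model class itself.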
- instance¶
An instance of the model initialized with method msdss_models_api.models.Model.input(). Initial value is None.
- Type
obj or None
- last_loaded¶
The date and time that the model was last loaded. If None, the model has not been loaded. Useful for lazy loading.
- Type
datetime.datetime or None
- metadata¶
Single-value metadata for the model based on the user inputs to provide additional info:
can_input (bool): same as parameter can_input
can_output (bool): same as parameter can_output
can_update (bool): same as parameter can_update
- Type
Author
Richard Wen <rrwen.dev@gmail.com>
Example
    import tempfile
    from msdss_models_api.models import Model

    with tempfile.NamedTemporaryFile() as file:

        # Create template model
        # Note - typically, you want to set all of the following:
        # Model(model=obj, file_path='path/to/model')
        model_path = file.name
        blank_model = Model(file_path=model_path)

        # Not saved yet, should be False
        can_load_before = blank_model.can_load()

        # Save the model
        blank_model.save()

        # Saved, should be True
        can_load_after = blank_model.can_load()
        blank_model.load() # loading works after save

        # To use the model, .input() and .output() needs to be defined
        # To update the model, .update() needs to be defined
        # The template model does nothing for these methods
        # It is recommended to extend this model and redefine these methods
        data = [1,2,3,4,5]
        blank_model.input(data)
        blank_model.output(data)
        blank_model.update(data)

        # After saving, you can delete the saved model
        # This will also set .instance to None
        blank_model.delete()
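The template methods do nothing until a subclass redefines them. The sketch below shows the general shape such an extension could take. To stay runnable without the package installed, it uses TemplateModel, a hypothetical stand-in that copies only the three placeholder methods and the instance attribute; in real use the subclass would extend msdss_models_api.models.Model instead. MeanModel and its centering logic are likewise invented for illustration.

```python
class TemplateModel:
    """Hypothetical stand-in for the template class (not the library's Model)."""

    def __init__(self, settings=None):
        self.settings = settings or {}
        self.instance = None  # set by input(), as the template expects

    def input(self, data):
        pass

    def output(self, data):
        pass

    def update(self, data):
        pass


class MeanModel(TemplateModel):
    """Toy model that centers values around the mean of the data seen so far."""

    def input(self, data):
        # Initialize the model state and store it in self.instance
        values = list(data)
        self.instance = {'n': len(values), 'total': sum(values)}

    def output(self, data):
        # Use self.instance to produce the output
        mean = self.instance['total'] / self.instance['n']
        return [value - mean for value in data]

    def update(self, data):
        # Fold new data into the existing state instead of starting over
        values = list(data)
        self.instance['n'] += len(values)
        self.instance['total'] += sum(values)


model = MeanModel()
model.input([1, 2, 3])               # mean of seen data is 2.0
centered = model.output([2, 4])
model.update([6])                    # mean of seen data is now 3.0
recentered = model.output([2, 4])
```

Here output() reads only from self.instance, so the same subclass keeps working after the state is restored from a save file.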
can_load¶
- Model.can_load()[source]¶
Checks if a model can be loaded using the save file.
- Returns
Whether the model can be loaded from the save file or not.
- Return type
Author
Richard Wen <rrwen.dev@gmail.com>
Example
    import tempfile
    from msdss_models_api.models import Model

    with tempfile.NamedTemporaryFile() as file:

        # Create template model
        model_path = file.name
        blank_model = Model(file_path=model_path)

        # Check if model can be loaded
        blank_model.can_load()
delete¶
- Model.delete(force=False)[source]¶
Deletes the saved model and sets the attribute .instance to None.
- Parameters
force (bool) – Whether to force deletion regardless of whether the files exist.
Author
Richard Wen <rrwen.dev@gmail.com>
Example
    import tempfile
    from msdss_models_api.models import Model

    with tempfile.NamedTemporaryFile() as file:

        # Create template model
        model_path = file.name
        blank_model = Model(file_path=model_path)

        # Save model
        blank_model.save()

        # Delete saved model
        blank_model.delete()
input¶
- Model.input(data)[source]¶
Template method for inputting data to initialize the model.
Requirements:
The first argument should be the input data described in the parameters
Any other arguments for the model can be defined after the first argument
Should set self.instance to the initialized model
Notes:
Does nothing but act as a template reference for class extension
This method should be redefined using a class extension
- Parameters
data (dict or list or pandas.DataFrame) – Data to use for initializing the model. Should accept a list or dict to be input into a pandas.DataFrame, or the dataframe itself.
Author
Richard Wen <rrwen.dev@gmail.com>
Example
    import tempfile
    from msdss_models_api.models import Model

    with tempfile.NamedTemporaryFile() as file:

        # Create template model
        model_path = file.name
        blank_model = Model(file_path=model_path)

        # Calling this should initialize the model instance
        # blank_model.instance should be set
        train_data = [
            {'col_a': 1, 'col_b': 'a'},
            {'col_a': 2, 'col_b': 'b'}
        ]
        blank_model.input(train_data)
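Because the data parameter may arrive as either a list or a dict, a redefined input() usually starts by normalizing it. The documentation describes doing this with a pandas.DataFrame; the sketch below swaps in a pandas-free version of the same idea so that it runs with the standard library alone. The helper name to_records is hypothetical, and it assumes a dict holds columns while a list holds row dicts.

```python
def to_records(data):
    """Normalize a dict of columns or a list of row dicts into a list of row dicts."""
    if isinstance(data, dict):
        columns = list(data)
        lengths = {len(data[column]) for column in columns}
        if len(lengths) > 1:
            raise ValueError('all columns must have the same length')
        rows = zip(*(data[column] for column in columns))
        return [dict(zip(columns, row)) for row in rows]
    if isinstance(data, list):
        # Copy each row so later changes do not touch the caller's data
        return [dict(row) for row in data]
    raise TypeError('data must be a dict of columns or a list of row dicts')


by_column = {'col_a': [1, 2], 'col_b': ['a', 'b']}
by_row = [{'col_a': 1, 'col_b': 'a'}, {'col_a': 2, 'col_b': 'b'}]

# Both layouts describe the same two rows
same_rows = to_records(by_column) == to_records(by_row)
```

With pandas available, the same normalization would typically go through the pandas.DataFrame constructor instead of a hand-written helper.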
load¶
- Model.load(force=False)[source]¶
Loads a saved model.
If the model file has not been changed, skips loading based on last_loaded.
- Parameters
force (bool) – Whether to force loading regardless of whether the saved model has changed.
Author
Richard Wen <rrwen.dev@gmail.com>
Example
    import tempfile
    from msdss_models_api.models import Model

    with tempfile.NamedTemporaryFile() as file:

        # Create template model
        model_path = file.name
        blank_model = Model(file_path=model_path)

        # Save model
        blank_model.save()

        # Load model
        blank_model.load()
needs_load¶
- Model.needs_load()[source]¶
Check if model needs to be loaded again.
- Returns
Whether the model needs to be loaded again based on whether the save file has changed.
- Return type
Author
Richard Wen <rrwen.dev@gmail.com>
Example
    import tempfile
    from msdss_models_api.models import Model

    with tempfile.NamedTemporaryFile() as file:

        # Create template model
        model_path = file.name
        blank_model = Model(file_path=model_path)

        # Save model
        blank_model.save()

        # Check if needs load
        blank_model.needs_load()
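The lazy loading described for load, needs_load, and last_loaded can be pictured as a comparison between the time of the last load and the time the save file last changed. The sketch below is one way to implement that idea, assuming the file's modification time is the signal; the documentation does not state how the library detects a change. LazyPickle is a hypothetical stand-in class, not the library's Model.

```python
import datetime
import os
import pickle
import tempfile
import time


class LazyPickle:
    """Hypothetical stand-in that reloads a pickle only when the file changed."""

    def __init__(self, file):
        self.file = file
        self.instance = None
        self.last_loaded = None  # None means the file has never been loaded

    def needs_load(self):
        if self.last_loaded is None:
            return True
        # Assumption: a newer modification time means the save file changed
        modified = datetime.datetime.fromtimestamp(os.path.getmtime(self.file))
        return modified > self.last_loaded

    def load(self, force=False):
        """Read the file if forced or stale; return True if a read happened."""
        if not (force or self.needs_load()):
            return False
        with open(self.file, 'rb') as handle:
            self.instance = pickle.load(handle)
        self.last_loaded = datetime.datetime.now()
        return True


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'model.pickle')
    with open(path, 'wb') as handle:
        pickle.dump({'weights': [1, 2, 3]}, handle)

    model = LazyPickle(path)
    first = model.load()              # never loaded, so the file is read
    second = model.load()             # unchanged file, so the read is skipped
    forced = model.load(force=True)   # force bypasses the staleness check

    # Simulate a later save by moving the modification time into the future
    later = time.time() + 60
    os.utime(path, (later, later))
    stale = model.needs_load()
```

Skipping unchanged files keeps repeated requests cheap, while force=True remains available when a fresh read is required.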
output¶
- Model.output(data)[source]¶
Template method for a model to output data such as predictions or clusters.
Requirements:
The first argument should be the input data described in the parameters
Any other arguments for the model output can be defined after the first argument
Ideally, should use self.instance to produce the output
Notes:
Does nothing but act as a template reference for class extension
This method should be redefined using a class extension
- Parameters
data (dict or list or pandas.DataFrame) – Data to use as input for the model. Should accept a list or dict to be input into a pandas.DataFrame, or the dataframe itself.
- Returns
Output data from the model using the input data from parameter data.
- Return type
pandas.DataFrame
Author
Richard Wen <rrwen.dev@gmail.com>
Example
    import tempfile
    from msdss_models_api.models import Model

    with tempfile.NamedTemporaryFile() as file:

        # Create template model
        model_path = file.name
        blank_model = Model(file_path=model_path)

        # Calling this should initialize the model instance
        # blank_model.instance should be set
        train_data = [
            {'col_a': 1, 'col_b': 'a'},
            {'col_a': 2, 'col_b': 'b'}
        ]
        blank_model.input(train_data)

        # Calling this will produce model outputs but only after .input() is used
        # blank_model.instance should be used to produce the outputs
        test_data = [
            {'col_a': 2, 'col_b': 'c'},
            {'col_a': 3, 'col_b': 'd'}
        ]
        out = blank_model.output(test_data)
save¶
- Model.save()[source]¶
Saves the model to a file to be loaded.
Author
Richard Wen <rrwen.dev@gmail.com>
Example
    import tempfile
    from msdss_models_api.models import Model

    with tempfile.NamedTemporaryFile() as file:

        # Create template model
        model_path = file.name
        blank_model = Model(file_path=model_path)

        # Save model
        blank_model.save()
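Since saving and loading are handled with pickle by default, the persistence round trip can be sketched with the standard library alone. The helpers below (model_file, save_instance, can_load_instance, load_instance) are hypothetical and only illustrate the idea; in particular, joining file_path and file_ext with a dot is an assumption based on the parameter descriptions, not a confirmed detail of the library.

```python
import os
import pickle
import tempfile


def model_file(file_path, file_ext='pickle'):
    # Assumption: the save file is the path plus a dot plus the extension
    return f'{file_path}.{file_ext}'


def save_instance(instance, file_path, file_ext='pickle'):
    """Write the object to the save file with pickle."""
    with open(model_file(file_path, file_ext), 'wb') as handle:
        pickle.dump(instance, handle)


def can_load_instance(file_path, file_ext='pickle'):
    """Report whether a save file exists to load from."""
    return os.path.isfile(model_file(file_path, file_ext))


def load_instance(file_path, file_ext='pickle'):
    """Read the object back from the save file."""
    with open(model_file(file_path, file_ext), 'rb') as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as folder:
    base = os.path.join(folder, 'model')   # path without the extension
    before = can_load_instance(base)       # nothing saved yet
    save_instance({'weights': [1, 2]}, base)
    after = can_load_instance(base)        # the save file now exists
    restored = load_instance(base)
```

Custom save and load methods become necessary mainly when the model object cannot be pickled or needs a format that other tools can read.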
update¶
- Model.update(data)[source]¶
Template method for updating a model with new data.
Requirements:
The first argument should be the input data described in the parameters
Any other arguments for the model update can be defined after the first argument
Ideally, should update self.instance
Notes:
Does nothing but act as a template reference for class extension
This method should be redefined using a class extension
- Parameters
data (dict or list or pandas.DataFrame) – Data to use for updating the model. Should accept a list or dict to be input into a pandas.DataFrame, or the dataframe itself.
Author
Richard Wen <rrwen.dev@gmail.com>
Example
    import tempfile
    from msdss_models_api.models import Model

    with tempfile.NamedTemporaryFile() as file:

        # Create template model
        model_path = file.name
        blank_model = Model(file_path=model_path)

        # Calling this should initialize the model instance
        # blank_model.instance should be set
        train_data = [
            {'col_a': 1, 'col_b': 'a'},
            {'col_a': 2, 'col_b': 'b'}
        ]
        blank_model.input(train_data)

        # Calling this will update the model but only after .input() is used
        # blank_model.instance should be updated with the new data
        new_data = [
            {'col_a': 2, 'col_b': 'c'},
            {'col_a': 3, 'col_b': 'd'}
        ]
        blank_model.update(new_data)
ModelCreate¶
- class msdss_models_api.models.ModelCreate(*, title: str = None, description: str = None, source: str = None, tags: str = None, settings: Dict[str, Any] = None)[source]¶
Class for creating models using the API.
Author
Richard Wen <rrwen.dev@gmail.com>
Example
    from msdss_models_api.models import *
    from pprint import pprint

    fields = ModelCreate.__fields__
    pprint(fields)
    {'description': ModelField(name='description', type=Optional[str], required=False, default=None),
     'settings': ModelField(name='settings', type=Optional[Mapping[str, Any]], required=False, default=None),
     'source': ModelField(name='source', type=Optional[str], required=False, default=None),
     'tags': ModelField(name='tags', type=Optional[str], required=False, default=None),
     'title': ModelField(name='title', type=Optional[str], required=False, default=None)}
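All five ModelCreate fields are optional, so a client only needs to send the ones it wants to set. As a rough sketch of assembling such a request body with the standard library, the hypothetical helper build_create_payload below collects the field names shown above into JSON. Whether the API expects unset fields to be omitted or sent as null is not stated here, so omitting them is an assumption, and the example values are invented.

```python
import json


def build_create_payload(title=None, description=None, source=None,
                         tags=None, settings=None):
    """Collect the ModelCreate field names into JSON, skipping unset fields."""
    candidate = {
        'title': title,
        'description': description,
        'source': source,
        'tags': tags,
        'settings': settings,
    }
    # Assumption: fields left as None are omitted rather than sent as null
    payload = {key: value for key, value in candidate.items() if value is not None}
    return json.dumps(payload, sort_keys=True)


# Hypothetical values for illustration
body = build_create_payload(
    title='Example model',
    tags='demo',
    settings={'n_clusters': 3},
)
decoded = json.loads(body)
```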
ModelMetadataUpdate¶
- class msdss_models_api.models.ModelMetadataUpdate(*, title: str = None, description: str = None, source: str = None, tags: str = None)[source]¶
Model for updating model metadata.
Author
Richard Wen <rrwen.dev@gmail.com>
Example
    from msdss_models_api.models import *
    from pprint import pprint

    fields = ModelMetadataUpdate.__fields__
    pprint(fields)
    {'description': ModelField(name='description', type=Optional[str], required=False, default=None),
     'source': ModelField(name='source', type=Optional[str], required=False, default=None),
     'tags': ModelField(name='tags', type=Optional[str], required=False, default=None),
     'title': ModelField(name='title', type=Optional[str], required=False, default=None)}
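Because every ModelMetadataUpdate field defaults to None, an update can carry just the fields that change. The sketch below shows one plausible way to merge such a partial update into stored metadata. It assumes that None means "leave this field unchanged", which the documentation does not confirm; apply_metadata_update and the sample values are hypothetical.

```python
def apply_metadata_update(current, title=None, description=None,
                          source=None, tags=None):
    """Return a copy of the metadata with every non-None field overwritten."""
    changes = {
        'title': title,
        'description': description,
        'source': source,
        'tags': tags,
    }
    updated = dict(current)  # copy so the stored metadata is not mutated
    # Assumption: None means the field should be left as it is
    updated.update({key: value for key, value in changes.items() if value is not None})
    return updated


# Hypothetical stored metadata
stored = {
    'title': 'Old title',
    'description': 'First draft',
    'source': 'lab',
    'tags': 'demo',
}
revised = apply_metadata_update(stored, title='New title', tags='demo prod')
```

Under this assumption a field cannot be cleared by passing None, so an API that needs clearing would require a separate convention.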