gianlp.models.trainable_model.TrainableModel
- class gianlp.models.trainable_model.TrainableModel(random_seed: int = 42)
- Bases: BaseModel, ABC
- Class for models that are trainable. It mimics the Keras API.
- Variables
- _random_seed – random seed used in training; subclasses can also use it for any random process
- _frozen – whether the model was frozen; this is needed for older TensorFlow versions
 
 - Methods
- build – Builds the whole chain of models in a recursive manner using the functional API.
- compile – Compiles the Keras model and prepares the text inputs to be used
- deserialize – Deserializes a model
- fit – Fits the model
- freeze – Freezes the model weights
- predict – Predicts using the model
- preprocess_texts – Given texts returns the array representation needed for forwarding the keras model
- serialize – Serializes the model to be deserialized with the deserialize method
- Attributes
- inputs – Method for getting all models that serve as input
- inputs_shape – Returns the shapes of the inputs of the model
- outputs_shape – Returns the output shape of the model
- trainable_weights_amount – Computes the total amount of trainable weights
- weights_amount – Computes the total amount of weights

 - preprocess_texts(texts: Union[List[str], Series, Dict[str, List[str]], DataFrame]) Union[List[ndarray], ndarray]
- Given texts, returns the array representation needed for forwarding the Keras model
- Parameters
- texts – the texts to preprocess 
- Returns
- a numpy array or list of numpy arrays representing the texts 
- Raises
- ValueError –
- When the model is multi-text and texts is not a dict or DataFrame
- When the model is not multi-text and texts is a dict or DataFrame
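
The two ValueError conditions above amount to a format check on the input. The following is a minimal sketch of that rule, not gianlp's implementation: the helper `check_text_input` and the `multi_text` flag are hypothetical names, and pandas inputs are left out to keep the sketch dependency-free.

```python
from typing import Dict, List, Union

TextInput = Union[List[str], Dict[str, List[str]]]


def check_text_input(texts: TextInput, multi_text: bool) -> None:
    """Hypothetical helper mirroring the documented ValueError rules."""
    is_mapping = isinstance(texts, dict)
    if multi_text and not is_mapping:
        raise ValueError("multi-text models need a dict (or DataFrame) of texts")
    if not multi_text and is_mapping:
        raise ValueError("single-text models need a list (or Series) of texts")


# Single-text model: a plain list of strings is accepted.
check_text_input(["first text", "second text"], multi_text=False)

# Multi-text model: one list of strings per named input.
check_text_input({"title": ["a"], "body": ["b"]}, multi_text=True)
```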
 
 
 - compile(optimizer: Union[str, Optimizer] = 'rmsprop', loss: Optional[Union[str, Loss]] = None, metrics: Optional[List[Union[str, Metric]]] = None, **kwargs: Any) None
- Compiles the Keras model and prepares the text inputs to be used
- Parameters
- optimizer – optimizer for training 
- loss – loss for training 
- metrics – metrics to use while training 
- **kwargs – any other parameters accepted by the Keras Model.compile API
 
- Raises
- AssertionError – When the model is not built
 
 
 - build(texts: Union[List[str], Series, Dict[str, List[str]], DataFrame]) None
- Builds the whole chain of models in a recursive manner using the functional API. Some operations may need the model to be built.
- Parameters
- texts – the texts used for building, if needed; some models have to learn from a sample corpus before they can work
- Raises
- ValueError – If the multi-text input keys do not match the ones in a multi-text model
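
Because compile raises an AssertionError on a model that is not built, build has to be called first. A minimal sketch of that ordering follows. It assumes `model` is an instance of some concrete TrainableModel subclass; the helper name `build_and_compile` is hypothetical, and the optimizer, loss and metric values are only illustrative Keras string identifiers.

```python
from typing import Any, List


def build_and_compile(model: Any, sample_texts: List[str]) -> None:
    """Build first, then compile, following the documented call order."""
    # `model` is assumed to be a concrete TrainableModel subclass instance.
    model.build(sample_texts)
    # Illustrative settings; any values accepted by compile would do.
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
```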
 
 - classmethod deserialize(data: bytes) BaseModel
- Deserializes a model
- Parameters
- data – the data for deserializing 
- Returns
- a BaseModel object 
 
 - static get_bytes_from_model(model: Model, copy: bool = False) bytes
- Transforms a Keras model into bytes
- Parameters
- model – the Keras model
- copy – whether to copy the model before saving. Copying is needed for complex nested models because the Keras save/load can fail
 
- Returns
- a byte array 
 
 - static get_model_from_bytes(data: bytes) Model
- Given the bytes of a Keras model serialized with the get_bytes_from_model method, returns the model
- Parameters
- data – the model bytes 
- Returns
- a keras model 
 
 - abstract property inputs: ModelInputsWrapper
- Method for getting all models that serve as input
- Returns
- a ModelInputsWrapper 
 
 - abstract property inputs_shape: Union[List[ModelIOShape], ModelIOShape]
- Returns the shapes of the inputs of the model
- Returns
- a list of shape tuples or a single shape tuple
 
 - abstract property outputs_shape: Union[List[ModelIOShape], ModelIOShape]
- Returns the output shape of the model
- Returns
- a list of shape tuples or a single shape tuple
 
 - serialize() bytes
- Serializes the model to be deserialized with the deserialize method
- Returns
- a byte array 
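
Since serialize returns bytes and the deserialize classmethod takes bytes, a model can be persisted by writing those bytes to a file. A minimal sketch under that assumption; `save_model` and `load_model` are hypothetical helper names, not part of gianlp.

```python
from typing import Any


def save_model(model: Any, path: str) -> None:
    """Write the bytes returned by serialize() to disk."""
    with open(path, "wb") as handle:
        handle.write(model.serialize())


def load_model(model_class: Any, path: str) -> Any:
    """Read bytes from disk and rebuild the model via the deserialize classmethod."""
    with open(path, "rb") as handle:
        return model_class.deserialize(handle.read())
```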
 
 - property trainable_weights_amount: Optional[int]
- Computes the total amount of trainable weights
- Returns
- the total amount of trainable weights or None if not built
 
 - property weights_amount: Optional[int]
- Computes the total amount of weights
- Returns
- the total amount of weights or None if not built
 
 - fit(x: Union[List[str], Series, Dict[str, List[str]], DataFrame], y: Union[List[ndarray], ndarray] = None)
- fit(x: Union[Generator[Tuple[Union[List[str], Series, Dict[str, List[str]], DataFrame], Union[List[ndarray], ndarray]], None, None], Sequence], y: None = None)
- Fits the model
- Parameters
- x – Input data. Could be:
- A generator that yields (x, y) where x is any valid format for x and y is the target numpy array
- A gianlp.utils.Sequence object that generates (x, y) where x is any valid format for x and y is the target output
- A list of texts 
- A pandas Series 
- A pandas Dataframe 
- A dict of lists containing texts 
 
- y – Target, ignored if x is a generator. Numpy array. 
- batch_size – Batch size for training, ignored if x is a generator or a gianlp.utils.Sequence
- epochs – Amount of epochs to train 
- verbose – verbose mode for Keras training 
- callbacks – list of Callback objects for Keras model 
- validation_split – the proportion of data to use for validation. Takes the last elements of x and y. Ignored if x is a generator or a gianlp.utils.Sequence object
- validation_data – Validation data. Could be:
- A tuple containing (x, y) where x is any valid format for x and y is the target numpy array
- A generator that yields (x, y) where x is any valid format for x and y is the target numpy array
- A gianlp.utils.Sequence object that generates (x, y) where x is any valid format for x and y is the target output
- steps_per_epoch – Amount of generator steps to consider an epoch as finished. Ignored if x is not a generator 
- validation_steps – Amount of generator steps to consider to feed each validation evaluation. Ignored if validation_data is not a generator 
- max_queue_size – Maximum size for the generator queue. If unspecified, max_queue_size will default to 10. 
- workers – Maximum number of processes to spin up when using process-based threading. If unspecified, workers will default to 1. 
- use_multiprocessing – If True, use process-based threading. If unspecified, use_multiprocessing will default to False. Note that because this implementation relies on multiprocessing, you should not pass non-picklable arguments to the generator as they can’t be passed easily to children processes. 
- **kwargs – extra arguments to give to keras.models.Model.fit
 
- Returns
- A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable). 
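
When x is a generator, it must yield (x, y) pairs and steps_per_epoch tells fit how many of them make up one epoch. The sketch below shows one way such a generator could be written. `batch_generator` and `steps_for` are hypothetical helpers, and looping forever is an assumption borrowed from the usual Keras generator convention, not something this page states.

```python
import math
from typing import Iterator, List, Sequence, Tuple


def batch_generator(
    texts: List[str], targets: Sequence, batch_size: int
) -> Iterator[Tuple[List[str], Sequence]]:
    """Yield (x, y) batches indefinitely (assumed Keras-style convention)."""
    while True:
        for start in range(0, len(texts), batch_size):
            end = start + batch_size
            # Slicing works for plain lists and for numpy arrays alike.
            yield texts[start:end], targets[start:end]


def steps_for(n_samples: int, batch_size: int) -> int:
    """Number of generator steps needed to cover the data once."""
    return math.ceil(n_samples / batch_size)
```

With these helpers, a call could look like `model.fit(batch_generator(texts, y, 32), steps_per_epoch=steps_for(len(texts), 32), epochs=5)`, where `y` would be the target numpy array and `model` a built and compiled model.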
 
 - predict(x: Union[Generator[Union[List[str], Series, Dict[str, List[str]], DataFrame], None, None], List[str], Series, Dict[str, List[str]], DataFrame], inference_batch: int = 256) Union[List[ndarray], ndarray]
- predict(x: Sequence, inference_batch: int = 256) Union[List[ndarray], ndarray]
- Predicts using the model
- Parameters
- x – Could be:
- A list of texts
- A pandas Series 
- A pandas Dataframe 
- A dict of lists containing texts 
- A generator of any of the above formats 
- A gianlp.utils.Sequence object that generates batches of text
 
- inference_batch – the batch size used for prediction; predicting in batches saves RAM. Ignored if x is a generator or a gianlp.utils.Sequence
- steps – steps for the generator, ignored if x is not a generator 
- max_queue_size – Maximum size for the generator queue. If unspecified, max_queue_size will default to 10. 
- workers – Maximum number of processes to spin up when using process-based threading. If unspecified, workers will default to 1. 
- use_multiprocessing – If True, use process-based threading. If unspecified, use_multiprocessing will default to False. Note that because this implementation relies on multiprocessing, you should not pass non-picklable arguments to the generator as they can’t be passed easily to children processes. 
- verbose – 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = single line. 
 
- Returns
- the output of the keras model 
- Raises
- ValueError – If a generator is given as x but no step amount is specified 
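
The inference_batch parameter exists because predicting in chunks keeps memory use bounded. The idea can be sketched as follows; this illustrates chunking in general and is not gianlp's internal code, and `iter_batches` is a hypothetical name.

```python
from typing import Iterator, List


def iter_batches(texts: List[str], inference_batch: int = 256) -> Iterator[List[str]]:
    """Split texts into consecutive chunks of at most inference_batch items."""
    for start in range(0, len(texts), inference_batch):
        yield texts[start:start + inference_batch]


# Five texts with a batch size of two give chunks of sizes 2, 2 and 1.
chunk_sizes = [len(chunk) for chunk in iter_batches(["a", "b", "c", "d", "e"], 2)]
```

Only one chunk needs to be preprocessed and forwarded at a time, so peak memory depends on inference_batch rather than on the total number of texts.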
 
 - freeze() None
- Freezes the model weights
- Raises
- ValueError – When the model is not built