wolpert.wrappers.time_series module¶
class wolpert.wrappers.time_series.TimeSeriesSplit(offset=0, test_set_size=1, min_train_size=1, max_train_size=None)[source]¶

Time Series cross-validator.
Provides train/test indices to split time series data samples, observed at fixed time intervals, into train/test sets. In each split, the test indices must be higher than those of the previous split, so shuffling in the cross-validator is inappropriate.
This cross-validation object is a variation of KFold. In the kth split, it returns the first k folds as the train set and the (k+1)th fold as the test set. Note that unlike standard cross-validation methods, successive training sets are supersets of those that come before them.
Read more in the User Guide.
Parameters: - offset : integer, optional (default=0)
Number of rows to skip between the last training row and the first test row in each split.
- test_set_size : integer, optional (default=1)
Size of the test set. This is also the number of rows added to the training set at each iteration.
- min_train_size : int, optional (default=1)
Minimum size for a single training set.
- max_train_size : int, optional (default=None)
Maximum size for a single training set.
Methods

split(X[, y, groups]): Generate indices to split data into training and test set.

split(X, y=None, groups=None)[source]¶ Generate indices to split data into training and test set.
Parameters: - X : array-like, shape (n_samples, n_features)
Training data, where n_samples is the number of samples and n_features is the number of features.
- y : array-like, shape (n_samples,)
Always ignored, exists for compatibility.
- groups : array-like, with shape (n_samples,), optional
Always ignored, exists for compatibility.
Yields: - train : ndarray
The training set indices for that split.
- test : ndarray
The testing set indices for that split.
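The interplay of offset, test_set_size, min_train_size and max_train_size is easiest to see concretely. Below is a minimal numpy sketch of the index generation described above; time_series_splits is a hypothetical illustration of the documented behavior, not the library's implementation.

```python
import numpy as np

# Hypothetical helper mirroring the split scheme documented above;
# it illustrates the described behavior, it is not wolpert's code.
def time_series_splits(n_samples, offset=0, test_set_size=1,
                       min_train_size=1, max_train_size=None):
    train_end = min_train_size
    while train_end + offset + test_set_size <= n_samples:
        train_start = 0
        if max_train_size is not None:
            train_start = max(0, train_end - max_train_size)
        # train set: everything up to train_end (optionally capped)
        train = np.arange(train_start, train_end)
        # test set: test_set_size rows, starting offset rows after the train set
        test = np.arange(train_end + offset,
                         train_end + offset + test_set_size)
        yield train, test
        # the training set grows by test_set_size rows each iteration
        train_end += test_set_size

for train, test in time_series_splits(6, test_set_size=2, min_train_size=2):
    print(train, test)
# → [0 1] [2 3]
#   [0 1 2 3] [4 5]
```

With offset=1, each test window would instead start one row after the training set ends, leaving a one-row gap.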
class wolpert.wrappers.time_series.TimeSeriesStackableTransformer(estimator, method='auto', scoring=None, verbose=False, offset=0, test_set_size=1, min_train_size=1, max_train_size=None, n_cv_jobs=1)[source]¶

Transformer to turn estimators into meta-estimators for model stacking.
Each split is composed of a train set containing the first t rows of the data set and a test set composed of rows t+k to t+k+n, where k and n are the offset and test_set_size parameters.

Parameters: - estimator : predictor
The estimator to be blended.
- method : string, optional (default='auto')
This method will be called on the estimator to produce the output of transform. If the method is 'auto', will try to invoke, for each estimator, predict_proba, decision_function or predict, in that order.
- scoring : string, callable, dict or None (default=None)
If not None, will save scores generated by the scoring object on the scores_ attribute each time blend is called.
- verbose : bool (default=False)
When true, prints scores to stdout. scoring must not be None.
- offset : integer, optional (default=0)
Number of rows to skip between the last training row and the first test row in each split.
- test_set_size : integer, optional (default=1)
Size of the test set. This is also the number of rows added to the training set at each iteration.
- min_train_size : int, optional (default=1)
Minimum size for a single training set.
- max_train_size : int, optional (default=None)
Maximum size for a single training set.
- n_cv_jobs : int, optional (default=1)
Number of jobs to be passed to cross_val_predict during blend.
Methods

blend(X, y, **fit_params): Transform dataset using time series split.
fit(X[, y]): Fit the estimator.
fit_blend(X, y, **fit_params): Transform dataset using cross validation and fit the estimator to the entire dataset.
fit_transform(X[, y]): Fit to data, then transform it.
get_params([deep]): Get parameters for this estimator.
set_params(**params): Set the parameters of this estimator.
transform(*args, **kwargs): Transform the whole dataset.
blend(X, y, **fit_params)[source]¶ Transform dataset using time series split.
Parameters: - X : array-like or sparse matrix, shape=(n_samples, n_features)
Input data used to fit the base estimator. Use dtype=np.float32 for maximum efficiency.
- y : array-like, shape = [n_samples]
Target values.
- **fit_params : parameters to be passed to the base estimator.
Returns: - X_transformed, indexes : tuple of (sparse matrix, array-like)
X_transformed is the transformed dataset; indexes contains the indices of the transformed rows in the input.
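To see what blend returns, here is a hedged sketch of its mechanics, with a trivial mean predictor standing in for the wrapped estimator; blend_sketch is illustrative only, the real method fits the wrapped estimator on each training window and collects its out-of-fold predictions.

```python
import numpy as np

# Illustrative sketch of blend()'s mechanics: for each time series split,
# "fit" on the training window and predict the test window. A running
# target mean stands in for the estimator; not wolpert's implementation.
def blend_sketch(X, y, offset=0, test_set_size=1, min_train_size=1):
    preds, indexes = [], []
    train_end = min_train_size
    while train_end + offset + test_set_size <= len(X):
        test_start = train_end + offset
        test = np.arange(test_start, test_start + test_set_size)
        # "fit" on the training rows: here, just the target mean
        prediction = y[:train_end].mean()
        preds.append(np.full(len(test), prediction))
        indexes.append(test)
        train_end += test_set_size
    return np.concatenate(preds), np.concatenate(indexes)

y = np.arange(6, dtype=float)
X_transformed, indexes = blend_sketch(np.zeros((6, 1)), y, min_train_size=2)
print(X_transformed)  # out-of-fold "predictions" for the later rows
print(indexes)        # rows of the input they correspond to
```

Note that the first min_train_size rows never receive a prediction, which is why blend also returns the indexes of the rows that were transformed.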
fit(X, y=None, **fit_params)[source]¶ Fit the estimator.
Parameters: - X : {array-like, sparse matrix}, shape = [n_samples, n_features]
Training vectors, where n_samples is the number of samples and n_features is the number of features.
- y : array-like, shape = [n_samples]
Target values.
- **fit_params : parameters to be passed to the base estimator.
Returns: - self : object
fit_blend(X, y, **fit_params)[source]¶ Transform dataset using cross validation and fit the estimator to the entire dataset.
Parameters: - X : array-like or sparse matrix, shape=(n_samples, n_features)
Input data used to fit the base estimator. Use dtype=np.float32 for maximum efficiency.
- y : array-like, shape = [n_samples]
Target values.
- **fit_params : parameters to be passed to the base estimator.
Returns: - X_transformed, indexes : tuple of (sparse matrix, array-like)
X_transformed is the transformed dataset; indexes contains the indices of the transformed rows in the input.
fit_transform(X, y=None, **fit_params)[source]¶ Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters: - X : numpy array of shape [n_samples, n_features]
Training set.
- y : numpy array of shape [n_samples]
Target values.
Returns: - X_new : numpy array of shape [n_samples, n_features_new]
Transformed array.
get_params(deep=True)[source]¶ Get parameters for this estimator.
Parameters: - deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: - params : mapping of string to any
Parameter names mapped to their values.
set_params(**params)[source]¶ Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
Returns: - self
transform(*args, **kwargs)[source]¶ Transform the whole dataset.
Parameters: - X : array-like or sparse matrix, shape=(n_samples, n_features)
Input data to be transformed. Use dtype=np.float32 for maximum efficiency. Sparse matrices are also supported; use sparse csr_matrix for maximum efficiency.
Returns: - X_transformed : sparse matrix, shape=(n_samples, n_out)
Transformed dataset.
class wolpert.wrappers.time_series.TimeSeriesWrapper(default_method='auto', default_scoring=None, verbose=False, offset=0, test_set_size=1, min_train_size=1, max_train_size=None, n_cv_jobs=1)[source]¶

Helper class to wrap estimators with TimeSeriesStackableTransformer.
Parameters: - default_method : string, optional (default='auto')
This method will be called on the estimator to produce the output of transform. If the method is 'auto', will try to invoke, for each estimator, predict_proba, decision_function or predict, in that order.
- default_scoring : string, callable, dict or None (default=None)
If not None, will save scores generated by the scoring object on the scores_ attribute each time blend is called.
- verbose : bool (default=False)
When true, prints scores to stdout. scoring must not be None.
- offset : integer, optional (default=0)
Number of rows to skip between the last training row and the first test row in each split.
- test_set_size : integer, optional (default=1)
Size of the test set. This is also the number of rows added to the training set at each iteration.
- min_train_size : int, optional (default=1)
Minimum size for a single training set.
- max_train_size : int, optional (default=None)
Maximum size for a single training set.
- n_cv_jobs : int, optional (default=1)
Number of jobs to be passed to cross_val_predict during blend.
Methods

wrap_estimator(estimator[, method]): Wraps an estimator and returns a transformer that is suitable for stacking.
wrap_estimator(estimator, method=None, **kwargs)[source]¶ Wraps an estimator and returns a transformer that is suitable for stacking.
Parameters: - estimator : predictor
The estimator to be blended.
- method : string or None, optional (default=None)
If not None, this method will be called on the estimator instead of default_method to produce the output of transform. If the method is 'auto', will try to invoke, for each estimator, predict_proba, decision_function or predict, in that order.
Returns: - t : TimeSeriesStackableTransformer
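The method='auto' resolution order described above can be sketched as follows; resolve_method and the toy estimator classes are hypothetical illustrations of the documented order, not wolpert's internals.

```python
# Sketch of the documented method='auto' resolution order: try
# predict_proba, then decision_function, then predict. Illustrative
# only; this mirrors the description above, not wolpert's code.
def resolve_method(estimator, method="auto"):
    if method != "auto":
        return method
    for name in ("predict_proba", "decision_function", "predict"):
        if hasattr(estimator, name):
            return name
    raise TypeError("estimator exposes no known prediction method")

class ToyClassifier:      # toy estimator with probability outputs
    def predict_proba(self, X): return X
    def predict(self, X): return X

class ToyRegressor:       # toy estimator with point predictions only
    def predict(self, X): return X

print(resolve_method(ToyClassifier()))  # predict_proba
print(resolve_method(ToyRegressor()))   # predict
```

Passing an explicit method short-circuits the lookup, which is what wrap_estimator's method argument does relative to default_method.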