stats

Source code: tianshou/data/stats.py

class SequenceSummaryStats(*, mean: float, std: float, max: float, min: float)

Bases: DataclassPPrintMixin

A data structure for storing the statistics of a sequence.

mean: float

std: float

max: float

min: float

classmethod from_sequence(sequence: Sequence[float | int] | ndarray) → SequenceSummaryStats

classmethod from_single_value(value: float | int) → SequenceSummaryStats

compute_dim_to_summary_stats(arr: Sequence[Sequence[float]] | ndarray) → dict[int, SequenceSummaryStats]

Compute summary statistics for each dimension of a sequence.

Parameters: arr – a 2-dimensional array (or sequence of sequences) from which to compute the statistics.
Returns: A dictionary mapping each dimension index to its summary statistics.
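
The description does not say which axis of `arr` counts as a "dimension". The sketch below assumes that the outer index enumerates dimensions, so each inner sequence holds all values of one dimension. The helper name `dim_to_summary_sketch` is hypothetical, and plain dictionaries stand in for `SequenceSummaryStats`.

```python
from statistics import fmean, pstdev


def dim_to_summary_sketch(arr):
    """Return {dimension index: summary dict} for a sequence of sequences."""
    # Assumption: arr[dim] contains every value observed for dimension `dim`.
    return {
        dim: {
            "mean": fmean(seq),
            "std": pstdev(seq),  # assumed population standard deviation
            "max": max(seq),
            "min": min(seq),
        }
        for dim, seq in enumerate(arr)
    }


per_dim = dim_to_summary_sketch([[1.0, 3.0], [10.0, 20.0]])
```

If the data is laid out with one row per sample instead, it would need to be transposed first under this assumption.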

class TimingStats(*, total_time: float = 0.0, train_time: float = 0.0, train_time_collect: float = 0.0, train_time_update: float = 0.0, test_time: float = 0.0, update_speed: float = 0.0)

Bases: DataclassPPrintMixin

A data structure for storing timing statistics.

total_time: float = 0.0 – The total time elapsed.

train_time: float = 0.0 – The total time elapsed for training (collecting samples plus model update).

train_time_collect: float = 0.0 – The total time elapsed for collecting training transitions.

train_time_update: float = 0.0 – The total time elapsed for updating models.

test_time: float = 0.0 – The total time elapsed for testing models.

update_speed: float = 0.0 – The update speed, in environment steps (env_step) per second.

class InfoStats(*, update_step: int, best_score: float, best_reward: float, best_reward_std: float, train_step: int, train_episode: int, test_step: int, test_episode: int, timing: TimingStats)

Bases: DataclassPPrintMixin

A data structure for storing information about the learning process.

update_step: int – The total number of update steps taken so far.

best_score: float – The best score over the test results. The model with the highest score is considered the best model.

best_reward: float – The best reward over the test results.

best_reward_std: float – The standard deviation of the best reward over the test results.

train_step: int – The total number of steps collected by the training collector.

train_episode: int – The total number of episodes collected by the training collector.

test_step: int – The total number of steps collected by the test collector.

test_episode: int – The total number of episodes collected by the test collector.

timing: TimingStats – The timing statistics.
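
The rule that the highest-scoring test result identifies the best model can be sketched as follows. The helper `pick_best` is hypothetical, and the sketch assumes that the score is simply the mean test reward; the page does not say how the score is computed, so treat this as one possible reading.

```python
from statistics import fmean, pstdev


def pick_best(test_rewards_per_epoch):
    """Return the best-result fields for the highest-scoring test run."""
    best = None
    for epoch, rewards in enumerate(test_rewards_per_epoch):
        mean_reward = fmean(rewards)
        candidate = {
            "epoch": epoch,
            "best_score": mean_reward,  # assumption: score == mean reward
            "best_reward": mean_reward,
            "best_reward_std": pstdev(rewards),
        }
        # Keep the candidate only if it strictly beats the current best.
        if best is None or candidate["best_score"] > best["best_score"]:
            best = candidate
    return best


best = pick_best([[1.0, 3.0], [4.0, 6.0], [2.0, 2.0]])
```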

class EpochStats(*, epoch: int, train_collect_stat: CollectStatsBase | None, test_collect_stat: CollectStats | None, training_stat: TrainingStats | None, info_stat: InfoStats)

Bases: DataclassPPrintMixin

A data structure for storing epoch statistics.

epoch: int – The current epoch.

train_collect_stat: CollectStatsBase | None – The statistics of the last call to the training collector.

test_collect_stat: CollectStats | None – The statistics of the last call to the test collector.

training_stat: TrainingStats | None – The statistics of the last model update step. Can be None if no model update was performed, typically in the last training iteration.

info_stat: InfoStats – Information about the learning process (see InfoStats).
