Migration notes
To petastorm 0.5.0
Petastorm 0.5.0 has some breaking changes from previous versions. These include:
- Users should use make_reader(), instead of instantiating Reader directly, to create new reader instances.
- It is still possible (although discouraged in most cases) to instantiate Reader directly. Some of its arguments have changed.
Use make_reader() to instantiate a reader instance
Use make_reader() to create a new instance of a reader. make_reader()
takes nearly the same arguments as the Reader constructor. The following
list enumerates the differences:
- reader_pool_type: takes one of the strings 'thread', 'process', 'dummy' (instead of ThreadPool(), ProcessPool() and DummyPool() object instances). Pass the number of workers using the workers_count argument.
- training_partition and num_training_partitions were renamed to cur_shard and shard_count.
- shuffle and shuffle_options were replaced by shuffle_row_groups=True, shuffle_row_drop_partitions=1.
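From:
from petastorm.reader import Reader
from petastorm.shuffle_options import ShuffleOptions
from petastorm.workers_pool.thread_pool import ThreadPool
reader = Reader(dataset_url,
                reader_pool=ThreadPool(5),
                training_partition=1, num_training_partitions=5,
                shuffle_options=ShuffleOptions(shuffle_row_groups=False))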
To:
from petastorm import make_reader
reader = make_reader(dataset_url,
                     reader_pool_type='thread',
                     workers_count=5,
                     cur_shard=1, shard_count=5,
                     shuffle_row_groups=False)
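When many call sites need migrating, the argument changes listed above can be applied mechanically. The following is a minimal sketch, not part of petastorm: translate_reader_kwargs, RENAMED, VALID_POOL_TYPES and FakeShuffleOptions are hypothetical names introduced here for illustration. It assumes the old shuffle_options object exposes shuffle_row_groups and shuffle_row_drop_partitions as attributes (a stand-in namedtuple is used so the sketch runs without petastorm installed). The pool type and worker count are passed explicitly, because the sketch does not inspect the old pool instance. The old boolean shuffle argument is rejected rather than guessed at.

```python
from collections import namedtuple

# Pool type strings accepted by make_reader(), per the list above.
VALID_POOL_TYPES = ('thread', 'process', 'dummy')

# Old Reader argument name -> make_reader() argument name.
RENAMED = {
    'training_partition': 'cur_shard',
    'num_training_partitions': 'shard_count',
}


def translate_reader_kwargs(old_kwargs, pool_type, workers_count):
    """Hypothetical helper: build make_reader() kwargs from old Reader kwargs."""
    if pool_type not in VALID_POOL_TYPES:
        raise ValueError('pool_type must be one of %r' % (VALID_POOL_TYPES,))

    new_kwargs = {'reader_pool_type': pool_type, 'workers_count': workers_count}
    for name, value in old_kwargs.items():
        if name == 'reader_pool':
            # Pool instances are replaced by reader_pool_type + workers_count.
            continue
        if name == 'shuffle':
            # Mapping of the old boolean is not specified here; migrate by hand.
            raise ValueError("translate 'shuffle' manually")
        if name == 'shuffle_options':
            # Assumption: the options object exposes these two attributes.
            new_kwargs['shuffle_row_groups'] = value.shuffle_row_groups
            new_kwargs['shuffle_row_drop_partitions'] = value.shuffle_row_drop_partitions
        else:
            # Renamed arguments are mapped; all others pass through unchanged.
            new_kwargs[RENAMED.get(name, name)] = value
    return new_kwargs


# Stand-in for the old ShuffleOptions object (illustration only).
FakeShuffleOptions = namedtuple(
    'FakeShuffleOptions', ['shuffle_row_groups', 'shuffle_row_drop_partitions'])

old = {
    'reader_pool': object(),  # placeholder for ThreadPool(5)
    'training_partition': 1,
    'num_training_partitions': 5,
    'shuffle_options': FakeShuffleOptions(False, 1),
}
new = translate_reader_kwargs(old, pool_type='thread', workers_count=5)
print(new)
```

The resulting dictionary could then be unpacked into the call, for example make_reader(dataset_url, **new). Keeping the pool type as an explicit parameter avoids depending on internals of the old pool classes.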