ray.data.datasource.ParquetMetadataProvider.prefetch_file_metadata
- ParquetMetadataProvider.prefetch_file_metadata(pieces: List[pyarrow.dataset.ParquetFileFragment], **ray_remote_args) → Optional[List[Any]]
Pre-fetches file metadata for all Parquet file fragments in a single batch.
Subsets of the metadata returned will be provided as input to subsequent calls to
_get_block_metadata() together with their corresponding Parquet file fragments.
Implementations that don't support pre-fetching file metadata shouldn't override this method.
- Parameters
pieces – The Parquet file fragments to fetch metadata for.
- Returns
Metadata resolved for each input file fragment, or
None. Metadata must be returned in the same order as all input file fragments, such that metadata[i] always contains the metadata for pieces[i].
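The ordering requirement can be illustrated with a minimal standalone sketch. It does not use Ray or PyArrow and is not Ray's implementation: prefetch_in_order and fetch_one are hypothetical names, and plain strings stand in for ParquetFileFragment objects. It only shows one way to fetch in parallel while keeping metadata[i] aligned with pieces[i]. The None return case is not covered here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Sequence


def prefetch_in_order(
    pieces: Sequence[Any],
    fetch_one: Callable[[Any], Any],
    max_workers: int = 4,
) -> List[Any]:
    """Hypothetical helper: fetch metadata for every piece in one batch.

    The result has the same length and order as `pieces`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so metadata[i]
        # belongs to pieces[i] even if the fetches finish out of order.
        return list(pool.map(fetch_one, pieces))


# Stand-in fragments (assumption: strings instead of real file fragments).
pieces = ["a.parquet", "b.parquet", "c.parquet"]
metadata = prefetch_in_order(pieces, lambda p: {"path": p})

# A later step that handles only some fragments would take the matching
# slice of the prefetched metadata, keeping the two lists aligned.
subset_pieces, subset_metadata = pieces[1:], metadata[1:]
```

Because the two lists share indices, any contiguous slice of pieces can be paired with the same slice of metadata without a lookup table.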