Dataset<Data>
Index
Properties
client
config (readonly)
id
log
name (optional)
storageObject (optional, readonly)
Methods
drop
- Removes the dataset either from the Apify cloud storage or from the local directory, depending on the mode of operation.
- Returns Promise<void>
export
- Returns all the data from the dataset. This iterates through the whole dataset via the listItems() client method, which returns only paginated results.
- Parameters
  - options (optional): DatasetExportOptions
- Returns Promise<Data[]>
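The page-by-page collection that export describes can be sketched against an in-memory stand-in. This is only an illustration under stated assumptions: makeFakeClient, exportAll, and the offset/limit/total field names are hypothetical and are not the real client API.

```javascript
// Hypothetical in-memory stand-in for a paginated client (not the real Apify client).
function makeFakeClient(allItems) {
    return {
        // Returns one page of items plus the total number of items available.
        async listItems({ offset = 0, limit = 2 } = {}) {
            return {
                items: allItems.slice(offset, offset + limit),
                total: allItems.length,
            };
        },
    };
}

// Collects every item by requesting page after page until the end is reached.
async function exportAll(client, pageSize = 2) {
    const collected = [];
    let offset = 0;
    while (true) {
        const { items, total } = await client.listItems({ offset, limit: pageSize });
        collected.push(...items);
        offset += items.length;
        // Stop on an empty page or once all reported items were read.
        if (items.length === 0 || offset >= total) break;
    }
    return collected;
}
```

Because every page is accumulated in memory before the function returns, a loop of this shape is best suited to datasets that comfortably fit in memory.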
exportTo
- Saves the entire contents of the dataset into one file within a key-value store.
- Parameters
  - key: string - The name of the value to save the data in.
  - options (optional): DatasetExportToOptions - An optional options object where you can provide the dataset and target KVS name.
  - contentType (optional): string - Only JSON and CSV are currently supported; defaults to JSON.
- Returns Promise<Data[]>
exportToCSV
- Saves the entire default dataset's contents into one CSV file within a key-value store.
- Parameters
  - key: string - The name of the value to save the data in.
  - options (optional): Omit<DatasetExportToOptions, 'fromDataset'> - An optional options object where you can provide the target KVS name.
- Returns Promise<void>
exportToJSON
- Saves the entire default dataset's contents into one JSON file within a key-value store.
- Parameters
  - key: string - The name of the value to save the data in.
  - options (optional): Omit<DatasetExportToOptions, 'fromDataset'> - An optional options object where you can provide the target KVS name.
- Returns Promise<void>
forEach
- Iterates over dataset items, yielding each in turn to an iteratee function. Each invocation of iteratee is called with two arguments: (item, index).
- If the iteratee function returns a Promise, it is awaited before the next call. If it throws an error, the iteration is aborted and the forEach function throws the error.
- Example usage:
    const dataset = await Dataset.open('my-results');
    await dataset.forEach(async (item, index) => {
        console.log(`Item at ${index}: ${JSON.stringify(item)}`);
    });
- Parameters
  - iteratee: DatasetConsumer<Data> - A function that is called for every item in the dataset.
  - options (optional): DatasetIteratorOptions - All forEach() parameters.
  - index (optional): number - Specifies the initial index number passed to the iteratee function.
- Returns Promise<void>
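The forEach contract (sequential calls, awaited Promises, abort on the first error, configurable starting index) can be illustrated over a plain array. This is a minimal sketch, not the library's implementation; forEachItem is a hypothetical helper name.

```javascript
// Minimal sketch of the documented forEach contract over a plain array.
// `index` is the initial index number passed to the first iteratee call.
async function forEachItem(items, iteratee, index = 0) {
    for (const item of items) {
        // A returned Promise is awaited before the next call.
        // If the iteratee throws (or rejects), the loop stops and the error propagates.
        await iteratee(item, index);
        index += 1;
    }
}
```

Awaiting inside the loop is what guarantees ordering: the second item is never handed to the iteratee before the first call has settled.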
getData
- Returns a DatasetContent object holding the items in the dataset based on the provided parameters.
- Parameters
  - options (optional): DatasetDataOptions
- Returns Promise<DatasetContent<Data>>
getInfo
- Returns an object containing general information about the dataset.
- The function returns the same object as the Apify API Client's getDataset function, which in turn calls the Get dataset API endpoint.
- Example:
    {
        id: "WkzbQMuFYuamGv3YF",
        name: "my-dataset",
        userId: "wRsJZtadYvn4mBZmm",
        createdAt: new Date("2015-12-12T07:34:14.202Z"),
        modifiedAt: new Date("2015-12-13T08:36:13.202Z"),
        accessedAt: new Date("2015-12-14T08:36:13.202Z"),
        itemCount: 14,
    }
- Returns Promise<undefined | DatasetInfo>
map
- Produces a new array of values by mapping each value in the list through a transformation function iteratee(). Each invocation of iteratee() is called with two arguments: (element, index).
- If iteratee returns a Promise, it is awaited before the next call.
- Parameters
  - iteratee: DatasetMapper<Data, R>
  - options (optional): DatasetIteratorOptions - All map() parameters.
- Returns Promise<R[]>
pushData
- Stores an object or an array of objects to the dataset. The function returns a promise that resolves when the operation finishes. It has no result, but throws on invalid arguments or other errors.
- IMPORTANT: Make sure to use the await keyword when calling pushData(), otherwise the crawler process might finish before the data is stored!
- The size of the data is limited by the receiving API and therefore pushData() will only allow objects whose JSON representation is smaller than 9MB. When an array is passed, none of the included objects may be larger than 9MB, but the array itself may be of any size.
- The function internally chunks the array into separate items and pushes them sequentially. The chunking process is stable (keeps order of data), but it does not provide a transaction safety mechanism. Therefore, in the event of an uploading error (after several automatic retries), the function's Promise will reject and the dataset will be left in a state where some of the items have already been saved to the dataset while other items from the source array were not. To overcome this limitation, the developer may, for example, read the last item saved in the dataset and re-attempt the save of the data from this item onwards to prevent duplicates.
- Parameters
  - data: Data | Data[] - Object or array of objects containing data to be stored in the default dataset. The objects must be serializable to JSON and the JSON representation of each object must be smaller than 9MB.
- Returns Promise<void>
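The per-item size rule and the suggested recovery after a partial upload can be sketched as two small helpers. Everything here is an assumption made for illustration: validateItems and remainingAfter are hypothetical names, "9MB" is read as 9 * 1024 * 1024 bytes of UTF-8 JSON, and each item is assumed to carry a unique key so the last saved item can be located in the source array.

```javascript
// Assumed reading of the documented "9MB" limit (bytes of UTF-8 encoded JSON).
const MAX_ITEM_BYTES = 9 * 1024 * 1024;

// Throws if any single item serializes to JSON at or above the limit.
// The array as a whole may be of any size; only individual items are checked.
function validateItems(items, maxBytes = MAX_ITEM_BYTES) {
    items.forEach((item, i) => {
        const bytes = new TextEncoder().encode(JSON.stringify(item)).length;
        if (bytes >= maxBytes) {
            throw new Error(`Item at index ${i} is too large (${bytes} bytes)`);
        }
    });
}

// Returns the items that still need saving, given the last item found in the dataset.
// `keyOf` extracts a unique key from an item (hypothetical; depends on your data).
function remainingAfter(sourceItems, lastSavedItem, keyOf) {
    if (lastSavedItem === undefined) return sourceItems.slice();
    const lastIndex = sourceItems.findIndex((item) => keyOf(item) === keyOf(lastSavedItem));
    // If the last saved item is not part of this batch, lastIndex is -1 and all items remain.
    return sourceItems.slice(lastIndex + 1);
}
```

After a rejected push, one would read the last stored item, pass it to remainingAfter together with the original array, and push only the returned tail. Because pushing is order-preserving, everything before that item is already stored.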
reduce
- Reduces a list of values down to a single value.
- The first element of the dataset is the initial value, and each successive step of the reduction should be returned by iteratee(). The iteratee() is passed three arguments: the memo, value and index of the current element being folded into the reduction.
- The iteratee is first invoked on the second element of the list (index = 1), with the first element given as the memo parameter. After that, the rest of the elements in the dataset are passed to iteratee, with the result of the previous invocation as the memo.
- If iteratee() returns a Promise, it is awaited before the next call.
- If the dataset is empty, reduce will return undefined.
- Parameters
  - iteratee: DatasetReducer<Data, Data>
- Returns Promise<undefined | Data>
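The memo handling described for reduce is easy to misread, so here is a minimal sketch of the same semantics over a plain array. reduceItems is a hypothetical helper, not the library's implementation.

```javascript
// Minimal sketch of the documented reduce semantics over a plain array.
async function reduceItems(items, iteratee) {
    // An empty input yields undefined.
    if (items.length === 0) return undefined;
    // The first element is the initial memo; the iteratee is never called for it.
    let memo = items[0];
    for (let index = 1; index < items.length; index++) {
        // A returned Promise is awaited before the next call.
        memo = await iteratee(memo, items[index], index);
    }
    return memo;
}
```

One consequence of this contract is that a single-element input returns that element unchanged, without invoking the iteratee at all.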
static exportToCSV
- Saves the entire default dataset's contents into one CSV file within a key-value store.
- Parameters
  - key: string - The name of the value to save the data in.
  - options (optional): DatasetExportToOptions - An optional options object where you can provide the dataset and target KVS name.
- Returns Promise<void>
static exportToJSON
- Saves the entire default dataset's contents into one JSON file within a key-value store.
- Parameters
  - key: string - The name of the value to save the data in.
  - options (optional): DatasetExportToOptions - An optional options object where you can provide the dataset and target KVS name.
- Returns Promise<void>
static getData
- Returns a DatasetContent object holding the items in the dataset based on the provided parameters.
- Parameters
  - options (optional): DatasetDataOptions
- Returns Promise<DatasetContent<Data>>
static open
- Opens a dataset and returns a promise resolving to an instance of the Dataset class.
- Datasets are used to store structured data where each object stored has the same attributes, such as online store products or real estate offers. The actual data is stored either on the local filesystem or in the cloud.
- For more details and code examples, see the Dataset class.
- Parameters
  - datasetIdOrName (optional): null | string - ID or name of the dataset to be opened. If null or undefined, the function returns the default dataset associated with the crawler run.
  - options (optional): StorageManagerOptions - Storage manager options.
- Returns Promise<Dataset<Data>>
The Dataset class represents a store for structured data where each object stored has the same attributes, such as online store products or real estate offers. You can imagine it as a table, where each object is a row and its attributes are columns. Dataset is an append-only storage: you can only add new records to it, but you cannot modify or remove existing records. Typically it is used to store crawling results.
Do not instantiate this class directly; use the Dataset.open function instead.
Dataset stores its data either on local disk or in the Apify cloud, depending on whether the APIFY_LOCAL_STORAGE_DIR or APIFY_TOKEN environment variables are set.
If the APIFY_LOCAL_STORAGE_DIR environment variable is set, the data is stored in the local directory in the following files:
Note that {DATASET_ID} is the name or ID of the dataset. The default dataset has the ID default, unless you override it by setting the APIFY_DEFAULT_DATASET_ID environment variable. Each dataset item is stored as a separate JSON file, where {INDEX} is a zero-based index of the item in the dataset.
If the APIFY_TOKEN environment variable is set but APIFY_LOCAL_STORAGE_DIR is not, the data is stored in the Apify Dataset cloud storage. Note that you can also force usage of the cloud storage by passing the forceCloud option to the Dataset.open function, even if the APIFY_LOCAL_STORAGE_DIR variable is set.
Example usage: