Data Collector Dashboard

Any collector you create, whether from a template or as a custom collector, appears on your Data Collector dashboard.

Dashboard overview:

- Free trial: The 7-day free trial includes 1,000 page loads.
- Update available: A new version of the collector is available. If there is no update button, you already have the latest version.
- Properties: View all of the collector’s properties.
- Delivery preferences: Choose your file format, delivery method, and notification settings.
- Output configuration / Schema: Go back and edit your output definitions.

Initiate collector and get collection results

Initiate Collector

To start collecting the data, you have three options:

A. Initiate by API
B. Initiate manually
C. Schedule a collector

Get collection results

Once the data collection is completed, click the “three dots” icon and select “Statistics” to access the results and download the data.

* Note: Real-time job inputs and outputs cannot be downloaded because they are not stored on our end.

Statistics

The statistics page presents essential information about the success of the data collection. Below is a list of all the terms included in the statistics table:

  • Job ID - The unique ID of the collection job
  • Trigger - Who initiated the data collection and how (by API, manually, or on a schedule)
  • Inputs - The number of inputs submitted to the collection
  • Records - The number of results collected
  • Failed - The number of pages that failed to be crawled
  • Success rate - The percentage of results that were collected successfully
  • Queued at - The date and time when the job was queued
  • Started at - The date and time when the collector began collecting
  • Finished at - The date and time when the collector finished collecting
  • Job time - The length of time the job took to complete
  • Estimated time left - The amount of time left until the collection is complete
  • Queue - The name given to the job in the trigger behavior (Queue name)
  • Usage - The total number of page loads used
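
As an illustration of how some of these columns relate to one another, here is a minimal Python sketch. It is not the dashboard's implementation: the `JobStats` class and its field names are hypothetical, and the success-rate formula (records divided by records plus failed pages) is an assumption, since the exact calculation is not documented here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class JobStats:
    """Hypothetical container mirroring a few columns of the statistics table."""
    job_id: str
    records: int            # "Records": number of results collected
    failed: int             # "Failed": number of pages that failed to be crawled
    started_at: datetime    # "Started at"
    finished_at: datetime   # "Finished at"

    def job_time(self) -> timedelta:
        # "Job time": elapsed time between start and finish.
        return self.finished_at - self.started_at

    def success_rate(self) -> float:
        # Assumption: successes / (successes + failures), as a percentage.
        attempted = self.records + self.failed
        if attempted == 0:
            return 0.0
        return 100.0 * self.records / attempted


# Made-up example values, not real job data.
stats = JobStats(
    job_id="j_example",
    records=95,
    failed=5,
    started_at=datetime(2024, 1, 1, 12, 0, 0),
    finished_at=datetime(2024, 1, 1, 12, 7, 30),
)
print(stats.job_time())                # 0:07:30
print(f"{stats.success_rate():.1f}%")  # 95.0%
```

The figures shown on the statistics page remain the authoritative values; the sketch only makes the relationships between the columns concrete.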

Statistics actions menu

  • “Three dots” icon

    Here you can perform the following actions on the data collection job:

    • Redeliver results - Resend the collector’s results to your predefined location.
      - Datasets: resend datasets only.
      - Datasets & Media: resend datasets and pictures/videos.
    • Rerun job - Initiate the same job again using the same inputs.
      - Rerun all from the beginning
      - Rerun failed pages
    • Quick view - Preview the results without downloading them.
      - Live streaming: view the results live as they are collected.
    • Crawl log - Review the logs created by the collector.
    • Errors - Show only the errors that occurred during the collection.
    • Activity - Display crawler activity and errors in a graph.
    • Download input file - Download the inputs you added to the job.
    • Results are wrong - Report incorrect results.
  • Download results
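
To make the difference between the two "Rerun job" options concrete, here is a small Python sketch. It is illustrative only: `RerunMode`, `inputs_to_rerun`, and the idea of tracking failed inputs in a set are hypothetical and do not describe how the platform stores or selects inputs. It assumes that "Rerun failed pages" processes only the inputs whose pages failed.

```python
from enum import Enum


class RerunMode(Enum):
    ALL = "rerun all from the beginning"
    FAILED_ONLY = "rerun failed pages"


def inputs_to_rerun(inputs, failed, mode):
    """Return the inputs a rerun would process, in their original order.

    inputs: list of input values given to the original job
    failed: set of inputs whose pages failed (hypothetical bookkeeping)
    mode:   which rerun option was chosen
    """
    if mode is RerunMode.ALL:
        return list(inputs)
    # Assumption: only inputs whose pages failed are processed again.
    return [item for item in inputs if item in failed]


# Made-up example inputs.
urls = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
]
failed_urls = {"https://example.com/b"}

print(inputs_to_rerun(urls, failed_urls, RerunMode.ALL))
print(inputs_to_rerun(urls, failed_urls, RerunMode.FAILED_ONLY))
```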