Architecture

A question mark (?) after a word, phrase, or diagram component means that item is optional.

OpenScraping Context

The diagram below shows a common example of how OpenScraping fits into a larger system.

graph TB
  subgraph A [Client]
  end
  A -->|Create Job| B[OpenScraping Job API]
  subgraph Scraper
    B -->|Distributes Job to an Agent| C[Agent]
    C -->|Run, and store in?| D[(Data Store)]
  end
  A -->|Retrieve Data| D
  A -->|Check status| B
  B -->|Notify Job completed, with data?| A
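The flow in the diagram above can be sketched as a small in-memory Python model. This is only an illustration under stated assumptions: the names `JobApi`, `DataStore`, `create_job`, `check_status`, and `fake_agent` are hypothetical and are not part of OpenScraping, and the agent runs inline rather than being distributed to a separate process. The optional parts of the diagram (the data store and the completion notification) are modeled as optional constructor arguments.

```python
import uuid
from typing import Callable, Dict, List, Optional, Tuple

# All names here are hypothetical; they only mirror the boxes in the diagram.
Agent = Callable[[str], dict]
Callback = Callable[[str, Optional[dict]], None]


class DataStore:
    """Stand-in for the optional Data Store."""

    def __init__(self) -> None:
        self._records: Dict[str, dict] = {}

    def put(self, job_id: str, data: dict) -> None:
        self._records[job_id] = data

    def get(self, job_id: str) -> Optional[dict]:
        return self._records.get(job_id)


class JobApi:
    """Stand-in for the Job API: accepts jobs, runs an agent, reports status."""

    def __init__(
        self,
        agent: Agent,
        store: Optional[DataStore] = None,      # "store in?" is optional
        on_complete: Optional[Callback] = None,  # notification is optional
    ) -> None:
        self._agent = agent
        self._store = store
        self._on_complete = on_complete
        self._status: Dict[str, str] = {}

    def create_job(self, url: str) -> str:
        job_id = uuid.uuid4().hex
        self._status[job_id] = "running"
        data = self._agent(url)  # assumption: run inline instead of distributing
        if self._store is not None:
            self._store.put(job_id, data)
        self._status[job_id] = "completed"
        if self._on_complete is not None:
            self._on_complete(job_id, data)
        return job_id

    def check_status(self, job_id: str) -> str:
        return self._status.get(job_id, "unknown")


def fake_agent(url: str) -> dict:
    """Placeholder agent that returns canned data without network access."""
    return {"url": url, "title": "example"}


# Client side: create a job, check its status, then retrieve the data.
store = DataStore()
notifications: List[Tuple[str, Optional[dict]]] = []
api = JobApi(
    fake_agent,
    store=store,
    on_complete=lambda job_id, data: notifications.append((job_id, data)),
)
job_id = api.create_job("https://example.com")
```

In this sketch the client learns about completion in two ways, matching the diagram: by polling `check_status`, or through the optional callback, which may carry the data directly.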

Agent Execution

The diagram below shows the approximate execution flow of an OSP agent.

graph TB
  B[Job Service] -->|Fetch job for agent| C[Requestor]
  subgraph Agent
    C --> D[Matcher 1]
    C --> K[Matcher 2]
    K --> J[No Match, end]
    C --> L[... Matcher N]
    L --> J[No Match, end]
    subgraph Matcher Processing
      D --> E[Parser?]
      E --> F[Extractor]
      F --> G[Generator?]
    end
  end
  G -->|Store relevant data| H[Data Service]

The Requestor is the component that accesses the network. TODO: Matchers are run in parallel to improve performance. Within a Matcher, the processing steps (Parser?, Extractor, Generator?) run sequentially and synchronously on that Matcher's thread.
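The execution flow described above might look roughly like the following Python sketch. It is not the OSP implementation: `Matcher`, `run_agent`, and the field names are hypothetical, matching on a URL regex is an assumption (the source does not say what a Matcher matches against), and a plain list stands in for the Data Service. Since parallel Matchers are marked TODO in the text, the thread pool here illustrates the intended behavior rather than confirmed behavior.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Matcher:
    """Hypothetical matcher: optional parser, required extractor, optional generator."""

    pattern: str                                       # assumption: regex on the URL
    extractor: Callable[[Any], dict]
    parser: Optional[Callable[[str], Any]] = None      # "Parser?" is optional
    generator: Optional[Callable[[dict], dict]] = None  # "Generator?" is optional

    def process(self, url: str, body: str) -> Optional[dict]:
        if re.search(self.pattern, url) is None:
            return None  # "No Match, end"
        # The steps below run sequentially on the calling thread.
        parsed = self.parser(body) if self.parser else body
        extracted = self.extractor(parsed)
        return self.generator(extracted) if self.generator else extracted


def run_agent(
    url: str,
    requestor: Callable[[str], str],
    matchers: List[Matcher],
    data_service: List[dict],
) -> List[dict]:
    body = requestor(url)  # the only step that touches the network
    # One worker per matcher, so each matcher's pipeline runs on its own thread.
    with ThreadPoolExecutor(max_workers=max(1, len(matchers))) as pool:
        results = list(pool.map(lambda m: m.process(url, body), matchers))
    matched = [r for r in results if r is not None]
    data_service.extend(matched)  # "Store relevant data"
    return matched


# Usage with a canned requestor, so no real network access happens.
matchers = [
    Matcher(
        pattern=r"example\.com",
        parser=str.strip,
        extractor=lambda text: {
            "title": re.search(r"<title>(.*?)</title>", text).group(1)
        },
        generator=lambda record: {**record, "source": "example"},
    ),
    Matcher(pattern=r"other\.org", extractor=lambda text: {"raw": text}),
]
data_service: List[dict] = []
out = run_agent(
    "https://example.com/page",
    lambda url: "  <title>Hello</title>  ",
    matchers,
    data_service,
)
```

Only the first matcher's pattern matches the URL, so the second one ends at the "No Match" branch and contributes nothing to the stored data.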