Architecture
A question mark (?) after a word, phrase, or diagram component means that item is optional.
OpenScraping Context
The diagram below shows a common example of how OpenScraping fits into a larger system.
graph TB
subgraph A [Client]
end
A -->|Create Job| B[OpenScraping Job API]
subgraph Scraper
B -->|Distributes Job to an Agent| C[Agent]
C -->|Run, and store in?| D[(Data Store)]
end
A -->|Retrieve Data| D
A -->|Check status| B
B -->|Notify Job completed, with data?| A
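The flow in the diagram above can be sketched in a few lines of Python. This is only an in-memory illustration: `JobAPI`, `DataStore`, `create_job`, `status`, and the `on_complete` callback are hypothetical names chosen for this sketch, not the actual OpenScraping API. The agent is modeled as a plain function and runs inline, whereas a real deployment would presumably dispatch it asynchronously.

```python
import itertools
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class DataStore:
    """Hypothetical stand-in for the Data Store node."""
    records: dict = field(default_factory=dict)


@dataclass
class Job:
    job_id: int
    url: str
    status: str = "queued"


class JobAPI:
    """Hypothetical stand-in for the OpenScraping Job API node."""

    def __init__(self, agent: Callable[[str], dict],
                 store: Optional[DataStore] = None):
        self._agent = agent
        self._store = store  # optional, mirroring "store in?"
        self._jobs = {}
        self._ids = itertools.count(1)

    def create_job(self, url: str, on_complete=None) -> int:
        job = Job(next(self._ids), url)
        self._jobs[job.job_id] = job
        # Distribute the job to an agent (run inline to keep the sketch short).
        data = self._agent(url)
        if self._store is not None:
            self._store.records[job.job_id] = data
        job.status = "completed"
        # Optional notification, mirroring "Notify Job completed, with data?".
        if on_complete is not None:
            on_complete(job.job_id, data)
        return job.job_id

    def status(self, job_id: int) -> str:
        return self._jobs[job_id].status
```

A client would then create a job, check its status through the Job API, and read the result from the data store (or receive it through the callback):

```python
store = DataStore()
api = JobAPI(agent=lambda url: {"url": url, "title": "stub"}, store=store)
job_id = api.create_job("https://example.com")
print(api.status(job_id), store.records[job_id])
```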
Agent Execution
This diagram shows the approximate execution flow of an OSP agent.
graph TB
B[Job Service] -->|Fetch job for agent| C[Requestor]
subgraph Agent
C --> D[Matcher 1]
C --> K[Matcher 2]
K --> J[No Match, end]
C --> L[... Matcher N]
L --> J[No Match, end]
subgraph Matcher Processing
D --> E[Parser?]
E --> F[Extractor]
F --> G[Generator?]
end
end
G -->|Store relevant data| H[Data Service]
The Requestor is the component that accesses the network. TODO: Matchers are run in parallel to improve performance. Each Matcher Processing step (Parser?, Extractor, Generator?) runs sequentially and synchronously on that Matcher's thread.
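As a rough illustration of this execution model, the sketch below runs each matcher on its own worker thread while keeping the steps inside a matcher sequential. All names here (`Matcher`, `process`, `run_agent`) are hypothetical and chosen for this example only. Since the text above marks parallel matching with a TODO, treat the thread pool as one possible way to do it, not a description of the current implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Matcher:
    """Hypothetical matcher: a predicate plus its processing steps."""
    matches: Callable[[str], bool]
    extractor: Callable[[object], dict]
    parser: Optional[Callable[[str], object]] = None    # optional ("Parser?")
    generator: Optional[Callable[[dict], dict]] = None  # optional ("Generator?")

    def process(self, response: str) -> Optional[dict]:
        if not self.matches(response):
            return None  # "No Match, end"
        # The steps below run one after another on the calling thread.
        parsed = self.parser(response) if self.parser else response
        extracted = self.extractor(parsed)
        return self.generator(extracted) if self.generator else extracted


def run_agent(job_url: str, requestor: Callable[[str], str],
              matchers: list, store: list) -> list:
    # Only the Requestor touches the network.
    response = requestor(job_url)
    # One worker per matcher; pool.map returns results in matcher order.
    with ThreadPoolExecutor(max_workers=max(1, len(matchers))) as pool:
        results = list(pool.map(lambda m: m.process(response), matchers))
    # Store only the data produced by matchers that actually matched.
    store.extend(r for r in results if r is not None)
    return store
```

A usage example with a stubbed Requestor, so no network access is needed:

```python
title_matcher = Matcher(
    matches=lambda r: "<title>" in r,
    parser=lambda r: r.split("<title>")[1].split("</title>")[0],
    extractor=lambda text: {"title": text},
)
print(run_agent("https://example.com",
                lambda url: "<title>Hi</title>", [title_matcher], []))
```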