Telegraph: An Adaptive Global-Scale Query Engine
Scenarios
There’s a Data Flood Coming
There’s a Data Flood Coming
The Telegraph Query Engine
CONTROL Continuous Output, Navigation & Transformation with Refinement On Line
CONTROL Continuous Output and Navigation Technology with Refinement On Line
CONTROL Continuous Output and Navigation Technology with Refinement On Line
River
Eddy
Telegraph: Putting it Together
Integration with Endeavour
Connectivity & Heterogeneity
Potter’s Wheel Anomaly Detection
413.50K
Категория: Базы данныхБазы данных

Telegraph: An Adaptive Global-Scale Query Engine

1. Telegraph: An Adaptive Global-Scale Query Engine

Telegraph: An Adaptive GlobalScale Query Engine
Joe Hellerstein

2. Scenarios

• Ubiquitous computing: more than clients!
– sensors and their data feeds are key
• smart dust, biomedical (MEMS sensors)
• each consumer good records (mis)use
– disposable computing
• video from surveillance cameras, broadcasts, etc.
• Global Data Federation
– all the data is online – what are we waiting for?
– The plumbing is coming
• XML/HTTP, etc. give LCD communication
• but how do you query robustly over many sites in the wide
area?

3. There’s a Data Flood Coming

4. There’s a Data Flood Coming

• What does it look like?
– Never ends: interactivity required
– Big: data reduction/aggregation is key
– Unpredictable: this scale of devices and nets
will not behave nicely

5. The Telegraph Query Engine

• Key technologies
– Interactive Control
• interactivity with early answers
• online aggregation for data reduction
– Continuously adaptive flow optimization
• massively parallel, adaptive dataflow via
Rivers and Eddies

6. CONTROL Continuous Output, Navigation & Transformation with Refinement On Line

CONTROL
Continuous Output, Navigation & Transformation with Refinement On Line
• Data-intensive jobs are long-running. How to
give early answers and interactivity?
– online interactivity over feeds: data “juggle”
– online query processing algs: ripple joins
– statistical estimators, and their performance
implications
• Appreciate interplay of massive data
processing, stats, and UIs

7. CONTROL Continuous Output and Navigation Technology with Refinement On Line

8. CONTROL Continuous Output and Navigation Technology with Refinement On Line

9. River

• We built the world’s fastest sorting machine
– On the “NOW”: 100 Sun workstations + SAN
– But it only beat the record under ideal
conditions!
• River: performance adaptivity for data flows on
clusters
– simplifies management and programming
– perfect for sensor-based streams

10. Eddy

Eddy
• How to order and reorder operators over time
– based on performance, economic/admin feedback
• Vs.River:
– River optimizes each operator “horizontally”
– Eddies optimize a pipeline “vertically”

11. Telegraph: Putting it Together

• Scalable, adaptive dataflow infrastructure. Apps include…
– sensor nets
– massively parallel and wide-area query engines
– net appliances: chaining xform8n/aggreg8n/etc. proxies
– any unpredictable dataflow scenario
• Technology: a marriage of…
– CONTROL, River & Eddy
• Many research questions here
• E.g. how to combine River and Eddy adaptivity
• E.g. how to tune Eddies for statistical performance goals
– Combinations of browse/query/mine at UI
– Storage management to handle new hardware realities

12. Integration with Endeavour

• Give
– Be data-intensive backbone to diverse clients
– Be replication dataflow engine for OceanStore
– Telegraph Storage Manager provides storage
(xactional/otherwise) for OceanStore
– Provide platform for data-intensive “tacit info mining”
• Take
– Leverage OceanStore to manager distributed metadata,
security
– Leverage protocols out of TinyOS for sensors

13.

Additional Slides
• For use in questions, etc.

14. Connectivity & Heterogeneity

Connectivity & Heterogeneity
• Lots of folks working on data format translation, parsing
– we will borrow, not build
– currently using JDBC & Cohera Net Query
• commercial tool, donated by Cohera Corp.
• gateways XML/HTML (via http) to ODBC/JDBC
– we may write “Teletalk” gateways from sensors
• Heterogeneity
– never a simple problem
– Control project developed interactive, online data
transformation tool: Potter’s Wheel

15.

16. Potter’s Wheel Anomaly Detection

English     Русский Правила