@Meghajit
Member

Closes #189

@Meghajit Meghajit self-assigned this Aug 12, 2022
@Meghajit Meghajit changed the title from "docs: create new documentation images" to "docs: update documentation to account for parquet source" Aug 12, 2022
@Meghajit Meghajit marked this pull request as ready for review August 16, 2022 11:19
@Meghajit Meghajit added the documentation label (Improvements or additions to documentation) Aug 16, 2022
### Components

![Dagger Architecture](/img/architecture.png)
![Dagger Architecture](/img/system_design/dagger_system_design.png)
Member

this path is broken

Member Author

@Meghajit Meghajit Aug 18, 2022

@prakharmathur82 But the image is getting rendered in the page. Do you mean you are not able to open the image via the link?

[Image: Screenshot 2022-08-18 at 11 58 47 AM]

Member

Member Author

Yeah, images don't get rendered on the markdown page for some reason. It was happening earlier also. You can check the master branch: https://github.com/odpf/dagger/blob/main/docs/docs/concepts/architecture.md

Member Author

Updated the relative paths
Fixed via commit 2de8360

Architecturally after the creation of Dagger, it goes through several stages before materializing the results to an output stream.

![](/img/dagger-lifecycle.png)
![](/img/lifecycle/dagger_lifecycle.png)
Member

broken

Member Author

@Meghajit Meghajit Aug 18, 2022

Image is getting rendered though

Member Author

Updated the relative paths
Fixed via commit 2de8360

processing and analysis on streaming data.

![](/img/overview.svg)
![](/img/overview/dagger_overview.png)
Member

broken

Member Author

Image is getting rendered though

Member Author

Updated the relative paths
Fixed via commit 2de8360

@@ -0,0 +1 @@
<mxfile host="app.diagrams.net" modified="2022-08-12T14:14:44.485Z" agent="5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36" etag="ofOYtoHlb1JWbJ9Jaj7X" version="20.2.3" type="device"><diagram id="Q19rsmPLgT0uxxymODiW" name="Page-1">[compressed draw.io diagram data omitted]</diagram></mxfile>
No newline at end of file
Member

why is source connected to proto handler?
why call it proto handler?
again check for pre processor workflow
same for name of proto handler towards sink

Member Author

> why is source connected to proto handler?

Hmm. I guess the source should only be connected to the deserializer. The deserializer and proto handler work together to parse raw data (Parquet records, Kafka records) into Row. Will fix this.

> why call it proto handler?

Yes, I should have called it Type Handler. It was a copy-paste mistake from the earlier diagram. I will edit the name.

> again check for pre processor workflow

Will do this.

> same for name of proto handler towards sink

Will change it to Type Handler.

Member Author

Updated the diagrams
Fixed via commit 7242b10

README.md Outdated
Dagger or Data Aggregator is an easy-to-use, configuration over code, cloud-native framework built on top of Apache Flink for stateful processing of real-time streaming data. With Dagger, you don't need to write custom applications or manage resources to process data in real-time.
Instead, you can write SQLs to do the processing and analysis on streaming data.
Dagger or Data Aggregator is an easy-to-use, configuration over code, cloud-native framework built on top of Apache Flink
for stateful processing of both real time and historical streaming data. With Dagger, you don't need to write custom
Member

how about real-time streaming and historical data?

Member Author

@prakharmathur82 We are able to process both real-time and historical batched data (Parquet files) as a stream. Hence, I put the "streaming" keyword after. Does the below look OK?

Dagger or Data Aggregator is an easy-to-use, configuration over code, cloud-native framework built on top of Apache Flink for stateful processing of streaming data, both real time and historical.

Member

We can just say "stateful processing of data"

Member

we can say stateful processing of data

Member Author

@Meghajit Meghajit Aug 23, 2022

Cool 👍
will do the change

Member Author

Updated via commit ab4a272


Dagger or Data Aggregator is a cloud native framework for processing real-time streaming data built on top of Apache Flink.
Dagger or Data Aggregator is a cloud native framework built on top of Apache Flink for processing both real-time and
historical streaming data.
Member

same as above

Member Author

Commented above for the same

Member Author

Fixed via commit f7bb8b0

You can look into the official [GCS Hadoop Connectors](https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/master/gcs/CONFIGURATION.md)
documentation to know more on how to edit this xml as per your needs.
Dagger requires configuring a source from where data will be streamed for processing. Please check
[here](./guides/choose_source.md) for the different available data sources.
Member

link is broken

Member Author

Fixed via commit 5accebb


The following section describes how to manage Dagger throughout its lifecycle.

### [Choosing a Dagger Source](./guides/choose_source.md)
Member

link is broken

Member Author

@Meghajit Meghajit Aug 23, 2022

Seems like the overview page links are broken even in master branch.

Fixed via commit 49eca41

@ravisuhag
Member

@Meghajit can you cross check if docs build is passing by running yarn build in /docs folder.

@Meghajit
Member Author

> @Meghajit can you cross check if docs build is passing by running yarn build in /docs folder.

[Image: Screenshot 2022-08-23 at 11 32 08 AM]

@ravisuhag Yes, it passed

@prakharmathur82 prakharmathur82 merged commit 6133f7f into raystack:main Aug 23, 2022
Successfully merging this pull request may close these issues.

docs: update documentation images and other sections to account for Parquet Source

3 participants