Commit da76504

Merge pull request #4846 from StackStorm/optimize_escaped_dict_fields
Optimize storage (serialization and de-serialization) of very large dictionaries inside MongoDB
2 parents 4022aea + 3b47856 commit da76504

File tree

84 files changed: +14,693 −211 lines


.github/workflows/ci.yaml

Lines changed: 1 addition & 0 deletions

@@ -40,6 +40,7 @@ jobs:
           - name: 'Unit Tests'
             task: 'ci-unit'
             python-version: '3.6'
+          # This job is slow so we only run in on a daily basis
           # - name: 'Micro Benchmarks'
           #   task: 'micro-benchmarks'
           #   python-version: '3.6'

CHANGELOG.rst

Lines changed: 138 additions & 6 deletions

@@ -17,11 +17,10 @@ Added

   Contributed by @anirudhbagri.

-Fixed
-~~~~~
+* Various additional metrics have been added to the action runner service to provide for better
+  operational visibility. (improvement) #4846

-* Refactor spec_loader util to use yaml.load with SafeLoader. (security)
-  Contributed by @ashwini-orchestral
+  Contributed by @Kami.

 Changed
 ~~~~~~~
@@ -37,12 +36,142 @@ Changed

   Contributed by @Kami and @shital.

+* Add new ``-x`` argument to the ``st2 execution get`` command which allows the
+  ``result`` field to be excluded from the output. (improvement) #4846
+
+* Update the ``st2 execution get <id>`` command to also display the execution ``log`` attribute,
+  which includes execution state transition information.
+
+  By default, the ``end_timestamp`` and ``duration`` attributes displayed in the command
+  output only include the time it took the action runner to finish running the actual action; they
+  don't include the time it takes the action runner container to fully finish running the
+  execution, which includes persisting the execution result in the database.
+
+  For actions which return large results, there can be a substantial discrepancy - e.g. the
+  action itself could finish in 0.5 seconds, but writing the data to the database could take an
+  additional 5 seconds after the action code itself was executed.
+
+  For all intents and purposes, the execution is not considered finished until the execution
+  result is persisted to the database.
+
+  While writing the result to the database the action runner is also consuming CPU cycles, since
+  serialization of large results is a CPU intensive task.
+
+  This means that the "elapsed" attribute and start_timestamp + end_timestamp will make it look
+  like the actual action completed in 0.5 seconds, but in reality it took 5.5 seconds (0.5 + 5 seconds).
+
+  The log attribute can be used to determine the actual duration of the execution (from start to
+  finish). (improvement) #4846
+
+  Contributed by @Kami.
+
+* Various internal improvements (reducing the number of DB queries, speeding up YAML parsing, using
+  a DB object cache, etc.) which should speed up pack action registration by 15-30%. This is
+  especially pronounced with packs which have a lot of actions (e.g. the aws one).
+  (improvement) #4846
+
+  Contributed by @Kami.
+
+* The underlying database field type and storage format for the ``Execution``, ``LiveAction``,
+  ``WorkflowExecutionDB``, ``TaskExecutionDB`` and ``TriggerInstanceDB`` database models has
+  changed.
+
+  This new format is much faster and more efficient than the previous one. Users with larger
+  executions (executions with larger results) should see the biggest improvements, but the change
+  also scales down, so there should also be improvements when reading and writing executions with
+  small and medium sized results.
+
+  Our micro and end to end benchmarks have shown improvements of up to 15-20x for the write path
+  (storing the model in the database) and up to 10x for the read path.
+
+  To put things into perspective - with the previous version, running a Python runner action which
+  returns an 8 MB result would take around ~18 seconds total, but with this new storage format it
+  takes around 2 seconds (in this context, duration means the time from when the execution was
+  scheduled to when the execution model and result were written and available in the database).
+
+  The difference is even larger when working with Orquesta workflows.
+
+  The overall performance improvement doesn't just mean a large decrease in those operation
+  timings, but also a large overall reduction in CPU usage - previously, serializing large results
+  was a CPU intensive task since it included tons of conversions and transformations back and forth.
+
+  The new format is also around 10-20% more storage efficient, which means it should allow
+  for larger model values (the MongoDB document size limit is 16 MB).
+
+  The actual change should be fully opaque and transparent to end users - it's purely a
+  field storage implementation detail and the code takes care of automatically handling both
+  formats when working with those objects.
+
+  The same field data storage optimizations have also been applied to workflow related database
+  models, which should result in the same performance improvements for Orquesta workflows which
+  pass larger data sets / execution results around.
+
+  The trigger instance payload field has also been updated to use this new field type, which should
+  result in lower CPU utilization and better throughput of the rules engine service when working
+  with triggers with larger payloads.
+
+  This should address a long standing issue where StackStorm was reported to be slow and CPU
+  inefficient when handling large executions. (improvement) #4846
+
+  Contributed by @Kami.
+
+* Add a new ``result_size`` field to the ``ActionExecutionDB`` model. This field will only be
+  populated for executions which utilize the new field storage format.
+
+  It holds the size of the serialized execution result field in bytes. This field will allow us to
+  implement more efficient execution result retrieval and provide a better UX, since we will be
+  able to avoid loading execution results in the WebUI for executions with very large results
+  (which cause the browser to freeze). (improvement) #4846
+
+  Contributed by @Kami.
+
+* Add a new ``/v1/executions/<id>/result[?download=1&compress=1&pretty_format=1]`` API endpoint
+  which can be used to retrieve or download the raw execution result as a (compressed) JSON file.
+
+  This endpoint will primarily be used by st2web when executions produce very large results, so
+  we can avoid loading, parsing and formatting those very large results as JSON in the browser,
+  which freezes the browser window / tab. (improvement) #4846
+
+  Contributed by @Kami.
+
+Improvements
+~~~~~~~~~~~~
+
+* The CLI has been updated to use ``orjson`` when parsing API responses and the C version of the
+  YAML safe dumper when formatting execution results for display. This should result in a speed up
+  when displaying execution results (``st2 execution get``, etc.) for executions with large results.
+
+  When testing it locally, the difference for an execution with an 8 MB result was 18 seconds vs ~6
+  seconds. (improvement) #4846
+
+  Contributed by @Kami.
+
+* Update various Jinja functions to utilize the C version of the YAML ``safe_{load,dump}`` functions
+  and orjson for better performance. (improvement) #4846
+
+  Contributed by @Kami.
+
+* For performance reasons, use the ``udatetime`` library for parsing ISO8601 / RFC3339 date strings
+  where possible. (improvement) #4846
+
+  Contributed by @Kami.
+
+Fixed
+~~~~~
+
+* Refactor spec_loader util to use yaml.load with SafeLoader. (security)
+  Contributed by @ashwini-orchestral
+
 * Import ABC from collections.abc for Python 3.10 compatibility. (#5007)
   Contributed by @tirkarthi

 * Updated to use virtualenv 20.4.0/PIP20.3.3 and fixate-requirements to work with PIP 20.3.3 #512
   Contributed by Amanda McGuinness (@amanda11 Ammeon Solutions)

+* Fix the ``st2 execution get --with-schema`` flag. (bug fix) #4846
+
+  Contributed by @Kami.
+
 3.4.0 - March 02, 2021
 ----------------------
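
As a rough illustration of the ``log`` attribute entry above: the end-to-end duration of an execution (including the time spent persisting a large result) can be derived from the state-transition timestamps. A minimal sketch, assuming each log entry carries an ISO8601 ``timestamp`` and a ``status`` key as displayed by ``st2 execution get``:

# Minimal sketch, not part of this commit: derive end-to-end duration from the
# execution "log" attribute. Assumes each entry is a dict with ISO8601
# "timestamp" and "status" keys, as displayed by "st2 execution get".
from datetime import datetime


def log_duration_seconds(log_entries):
    timestamps = sorted(
        datetime.strptime(entry["timestamp"], "%Y-%m-%dT%H:%M:%S.%fZ")
        for entry in log_entries
    )
    return (timestamps[-1] - timestamps[0]).total_seconds()


# Illustrative values mirroring the 0.5 s action / 5 s result write example above
log = [
    {"status": "requested", "timestamp": "2021-04-06T10:00:00.000000Z"},
    {"status": "scheduled", "timestamp": "2021-04-06T10:00:00.100000Z"},
    {"status": "running", "timestamp": "2021-04-06T10:00:00.200000Z"},
    {"status": "succeeded", "timestamp": "2021-04-06T10:00:05.700000Z"},
]
print(log_duration_seconds(log))  # 5.7 - vs the ~0.5 s the action code itself took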

@@ -60,8 +189,9 @@ Added
 * Added st2-auth-ldap pip requirements for LDAP auth integration. (new feature) #5082
   Contributed by @hnanchahal

-* Added --register-recreate-virtualenvs flag to st2ctl reload to recreate virtualenvs from scratch.
-  (part of upgrade instructions) [#5167]
+* Added --register-recreate-virtualenvs flag to st2ctl reload to recreate virtualenvs from
+  scratch. (part of upgrade instructions) #5167
+
   Contributed by @winem and @blag

 Changed
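
The new field implementation itself is not among the files shown in this view. Purely as an illustration of the general idea described in the storage-format entry above - serialize a large dict once into a JSON blob instead of storing it as a nested, key-escaped BSON document - a minimal mongoengine-style sketch might look like the following; the class name and all details are assumptions, not the code shipped in this commit:

# Illustrative sketch only - not the field implementation from this commit.
# The idea: serialize the whole dict once with orjson and store it as a binary
# blob, instead of escaping and converting every key of a nested BSON document.
import orjson
from mongoengine.fields import BinaryField


class JSONDictField(BinaryField):  # hypothetical name
    def to_mongo(self, value):
        # dict -> bytes once, on save
        return super().to_mongo(orjson.dumps(value))

    def to_python(self, value):
        # bytes -> dict once, on read; values that are already dicts pass through
        if isinstance(value, (bytes, memoryview)):
            return orjson.loads(bytes(value))
        return value

Serializing once with a fast C-backed JSON library is what lets the write and read paths scale with result size rather than with the number of nested keys, which is consistent with the 15-20x write and 10x read improvements claimed above.
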
@@ -186,6 +316,7 @@ Added

 Changed
 ~~~~~~~
+
 * Switch to MongoDB ``4.0`` as the default version starting with all supported OS's in st2
   ``v3.3.0`` (improvement) #4972

@@ -208,6 +339,7 @@ Changed

 Fixed
 ~~~~~
+
 * Fixed a bug where `type` attribute was missing for netstat action in linux pack. Fixes #4946

   Reported by @scguoi and contributed by Sheshagiri (@sheshagiri)
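
A hypothetical client-side use of the new ``/v1/executions/<id>/result`` endpoint documented in the changelog above; the host, API prefix, execution id and auth token are placeholders, not values from this commit:

# Hypothetical usage of the new execution result endpoint; host, execution id
# and token are placeholders.
import requests

execution_id = "60a5262dd21bfe51234c5678"  # placeholder id
url = "https://st2.example.com/api/v1/executions/%s/result" % (execution_id)

response = requests.get(
    url,
    params={"download": 1, "pretty_format": 1},  # compress=1 is also supported
    headers={"X-Auth-Token": "<token>"},
    stream=True,
)
response.raise_for_status()

# Stream the (potentially very large) result straight to disk instead of
# parsing it in memory, which is exactly what this endpoint is meant to avoid.
with open("execution_result.json", "wb") as fp:
    for chunk in response.iter_content(chunk_size=1024 * 1024):
        fp.write(chunk)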

Makefile

Lines changed: 13 additions & 3 deletions

@@ -63,14 +63,19 @@ ifndef PYLINT_CONCURRENCY
 PYLINT_CONCURRENCY := 1
 endif

-NOSE_OPTS := --rednose --immediate --with-parallel --nocapture
+# NOTE: We exclude resourceregistrar DEBUG level log messages since those are very noisy (we
+# loaded resources for every tests) which makes tests hard to troubleshoot on failure due to
+# pages and pages and pages of noise.
+# The minus in front of st2.st2common.bootstrap filters out logging statements from that module.
+# See https://nose.readthedocs.io/en/latest/usage.html#cmdoption-logging-filter
+NOSE_OPTS := --rednose --immediate --with-parallel --nocapture --logging-filter=-st2.st2common.bootstrap

 ifndef NOSE_TIME
 NOSE_TIME := yes
 endif

 ifeq ($(NOSE_TIME),yes)
-NOSE_OPTS := --rednose --immediate --with-parallel --with-timer --nocapture
+NOSE_OPTS := --rednose --immediate --with-parallel --with-timer --nocapture --logging-filter=-st2.st2common.bootstrap
 NOSE_WITH_TIMER := 1
 endif

@@ -262,7 +267,7 @@ check-python-packages-nightly:
 	done

 .PHONY: ci-checks-nightly
-ci-checks-nightly: check-python-packages-nightly
+ci-checks-nightly: check-python-packages-nightly micro-benchmarks

 .PHONY: checklogs
 checklogs:
@@ -522,6 +527,11 @@ micro-benchmarks: requirements .micro-benchmarks
 	@echo
 	@echo "==================== micro-benchmarks ===================="
 	@echo
+	. $(VIRTUALENV_DIR)/bin/activate; pytest --benchmark-only --benchmark-name=short --benchmark-columns=min,max,mean,stddev,median,ops,rounds --benchmark-group-by=group,param:fixture_file -s -v st2common/benchmarks/micro/test_mongo_field_types.py -k "test_save_large_execution"
+	. $(VIRTUALENV_DIR)/bin/activate; pytest --benchmark-only --benchmark-name=short --benchmark-columns=min,max,mean,stddev,median,ops,rounds --benchmark-group-by=group,param:fixture_file -s -v st2common/benchmarks/micro/test_mongo_field_types.py -k "test_read_large_execution"
+	. $(VIRTUALENV_DIR)/bin/activate; pytest --benchmark-only --benchmark-name=short --benchmark-columns=min,max,mean,stddev,median,ops,rounds --benchmark-group-by=group,param:fixture_file -s -v st2common/benchmarks/micro/test_mongo_field_types.py -k "test_save_multiple_fields"
+	. $(VIRTUALENV_DIR)/bin/activate; pytest --benchmark-only --benchmark-name=short --benchmark-columns=min,max,mean,stddev,median,ops,rounds --benchmark-group-by=group,param:fixture_file -s -v st2common/benchmarks/micro/test_mongo_field_types.py -k "test_save_large_string_value"
+	. $(VIRTUALENV_DIR)/bin/activate; pytest --benchmark-only --benchmark-name=short --benchmark-columns=min,max,mean,stddev,median,ops,rounds --benchmark-group-by=group,param:fixture_file -s -v st2common/benchmarks/micro/test_mongo_field_types.py -k "test_read_large_string_value"
 	. $(VIRTUALENV_DIR)/bin/activate; pytest --benchmark-only --benchmark-name=short --benchmark-columns=min,max,mean,stddev,median,ops,rounds --benchmark-group-by=group,param:dict_keys_count_and_depth -s -v st2common/benchmarks/micro/test_fast_deepcopy.py -k "test_fast_deepcopy_with_dict_values"
 	. $(VIRTUALENV_DIR)/bin/activate; pytest --benchmark-only --benchmark-name=short --benchmark-columns=min,max,mean,stddev,median,ops,rounds --benchmark-group-by=group,param:fixture_file -s -v st2common/benchmarks/micro/test_fast_deepcopy.py -k "test_fast_deepcopy_with_json_fixture_file"
 	. $(VIRTUALENV_DIR)/bin/activate; pytest --benchmark-only --benchmark-name=short --benchmark-columns=min,max,mean,stddev,median,ops,rounds --benchmark-group-by=group,param:fixture_file,param:indent_sort_keys_tuple -s -v st2common/benchmarks/micro/test_json_serialization_and_deserialization.py -k "test_json_dumps"
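
The benchmark module invoked above (st2common/benchmarks/micro/test_mongo_field_types.py) is part of the commit but not shown in this view. For orientation, a pytest-benchmark test driven by the same kind of ``fixture_file`` parametrization might look roughly like this; the fixture path and the measured stand-in are assumptions, not the actual benchmark code:

# Illustrative sketch of a pytest-benchmark micro-benchmark; the fixture path
# and the serialization stand-in are assumptions, not the real benchmark.
import json

import orjson
import pytest


@pytest.mark.parametrize("fixture_file", ["fixtures/json_8mb.json"])  # hypothetical fixture
def test_save_large_execution(benchmark, fixture_file):
    with open(fixture_file, "rb") as fp:
        result = json.loads(fp.read())

    # Measure only the serialization step as a stand-in for persisting a large
    # execution result; pytest-benchmark calls this repeatedly and reports
    # min/max/mean/stddev/median/ops, matching the columns requested above.
    benchmark(orjson.dumps, result)
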
Lines changed: 1 addition & 1 deletion

@@ -1 +1 @@
-mail-parser>=3.9.1,<3.10.0
+mail-parser==3.15.0

contrib/core/tests/test_action_sendmail.py

Lines changed: 11 additions & 31 deletions

@@ -20,7 +20,6 @@
 import tempfile
 import socket

-import six
 import mock
 import mailparser

@@ -126,20 +125,12 @@ def test_sendmail_utf8_subject_and_body(self):
             "attachments": "",
         }

-        if six.PY2:
-            expected_body = (
-                "Hello there 😃😃.\n"
-                "<br><br>\n"
-                "This message was generated by StackStorm action "
-                "send_mail running on %s" % (HOSTNAME)
-            )
-        else:
-            expected_body = (
-                "Hello there \\U0001f603\\U0001f603.\n"
-                "<br><br>\n"
-                "This message was generated by StackStorm action "
-                "send_mail running on %s" % (HOSTNAME)
-            )
+        expected_body = (
+            "Hello there 😃😃.\n"
+            "<br><br>\n"
+            "This message was generated by StackStorm action "
+            "send_mail running on %s" % (HOSTNAME)
+        )

         status, _, email_data, message = self._run_action(
             action_parameters=action_parameters
@@ -167,18 +158,11 @@ def test_sendmail_utf8_subject_and_body(self):
             "attachments": "",
         }

-        if six.PY2:
-            expected_body = (
-                "Hello there 😃😃.\n\n"
-                "This message was generated by StackStorm action "
-                "send_mail running on %s" % (HOSTNAME)
-            )
-        else:
-            expected_body = (
-                "Hello there \\U0001f603\\U0001f603.\n\n"
-                "This message was generated by StackStorm action "
-                "send_mail running on %s" % (HOSTNAME)
-            )
+        expected_body = (
+            "Hello there 😃😃.\n\n"
+            "This message was generated by StackStorm action "
+            "send_mail running on %s" % (HOSTNAME)
+        )

         status, _, email_data, message = self._run_action(
             action_parameters=action_parameters
@@ -271,10 +255,6 @@ def _run_action(self, action_parameters):
             email_data = result["stdout"]
             email_data = email_data.split("\n")[:-2]
             email_data = "\n".join(email_data)
-
-            if six.PY2 and isinstance(email_data, six.text_type):
-                email_data = email_data.encode("utf-8")
-
             message = mailparser.parse_from_string(email_data)
         else:
             email_data = None
Lines changed: 11 additions & 0 deletions

@@ -0,0 +1,11 @@
+---
+name: orquesta-data-flow-large-data
+description: A basic workflow which passes large JSON data around.
+runner_type: orquesta
+entry_point: workflows/orquesta-data-flow-large-data.yaml
+enabled: true
+parameters:
+  file_path:
+    type: string
+    required: true
+    description: "Path to the JSON fixture file to use."
Lines changed: 11 additions & 0 deletions

@@ -0,0 +1,11 @@
+---
+name: python_runner_load_and_print_fixture
+description: Action which loads provided JSON fixture file, parses it and returns it as an action result. Useful when testing and benchmarking execution save timing.
+runner_type: "python-script"
+enabled: true
+entry_point: pythonactions/load_and_print_fixture.py
+parameters:
+  file_path:
+    type: "string"
+    required: true
+    description: "Path to the JSON fixture file to use."
Lines changed: 12 additions & 0 deletions

@@ -0,0 +1,12 @@
+import json
+
+from st2common.runners.base_action import Action
+
+
+class LoadAndPrintFixtureAction(Action):
+    def run(self, file_path: str):
+        with open(file_path, "r") as fp:
+            content = fp.read()
+
+        data = json.loads(content)
+        return data
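
The action above simply loads and returns whatever JSON fixture it is pointed at, so benchmarking execution save timings requires a suitably large fixture file. A throwaway generator along these lines can produce one; the size and output path are arbitrary choices, not part of this commit:

# Generate a roughly 8 MB JSON fixture to feed the load_and_print_fixture
# action / large-data workflow; size and path are arbitrary.
import json

# ~1 KB per item, so ~8000 items comes out to roughly 8 MB of JSON
data = {"items": [{"index": i, "payload": "x" * 1024} for i in range(8000)]}

with open("/tmp/fixture_8mb.json", "w") as fp:
    json.dump(data, fp)

print("fixture size: %.1f MB" % (len(json.dumps(data)) / (1024.0 * 1024.0)))
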
Lines changed: 37 additions & 0 deletions

@@ -0,0 +1,37 @@
+version: 1.0
+
+description: A basic workflow which passes large data around.
+
+input:
+  - file_path
+  - b1: <% ctx().file_path %>
+
+vars:
+  - a2: <% ctx().b1 %>
+  - b2: <% ctx().a2 %>
+
+output:
+  - a5: <% ctx().b4 %>
+  - b5: <% ctx().a5 %>
+
+tasks:
+  task1:
+    action: core.echo
+    input:
+      message: <% ctx().b2 %>
+    next:
+      - when: <% succeeded() %>
+        publish:
+          - a3: <% result().stdout %>
+          - b3: <% ctx().a3 %>
+        do: task2
+  task2:
+    action: examples.load_and_print_fixture
+    input:
+      file_path: <% ctx().file_path %>
+    next:
+      - when: <% succeeded() %>
+        publish: a4=<% result().result %> b4=<% ctx().a4 %>
+        do: task3
+  task3:
+    action: core.noop
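
One hypothetical way to schedule the workflow above once the pack is registered, using the executions API; the pack name ("examples"), host, token and fixture path are assumptions, not part of this diff:

# Hypothetical request scheduling the large-data workflow via the st2 API;
# pack name, host, token and fixture path are placeholders.
import requests

payload = {
    "action": "examples.orquesta-data-flow-large-data",
    "parameters": {"file_path": "/tmp/fixture_8mb.json"},
}
response = requests.post(
    "https://st2.example.com/api/v1/executions",
    json=payload,
    headers={"X-Auth-Token": "<token>"},
)
response.raise_for_status()
print("execution id:", response.json()["id"])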
