Remove 4 26 by dalyw · Pull Request #15 · we3lab/wwtp-process-extraction

dalyw · 2026-05-20T17:56:09Z

No description provided.

In unitprocess_json file:
Adding Denitrification Filter to UP list
Moving anaerobic filter out of fixed film category
Moving biosolids lagoon out of disposal category and adding Lagoon as secondary category
Cleaning up some alt_names
Deleting old llm output files (using facility name rather than place ID)

Modified LLM prompt to further encourage structured output and adherence to ontology categories
Expanding San Jose example to include solids, disinfection
Renaming "truth" to "manual reading" throughout after figure_2

Renaming cwns_processes_by_facility to cwns_unit_processes_by_facility for consistency with LLM file
Updating README

fletchapin

Overall it looks great! I just had a minor comment about the date folder

fletchapin · 2026-05-30T00:23:29Z

 sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))

-DATE_FOLDER = "2026-5-15"
+DATE_FOLDER = "2026-5-25"


Is the date important for publication? I guess I'm wondering if we can just remove a level of nesting and publish this data directly in the output folder

dalyw · 2026-05-30

The date only matters if we re-run every ~6-12 months as permits are updated and want to keep separate versions of the results.

But we could keep these analysis scripts “flat” and then save any date-specific versions in the Stanford Digital Repository output file - how does that sound?
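
A minimal sketch of what the "flat" layout could look like: results go straight into the output folder by default, and a dated subfolder is used only when a run label is passed explicitly, for example when archiving a version. `OUTPUT_ROOT`, `output_dir`, and `run_label` are hypothetical names for illustration, not identifiers from this repository.

```python
from pathlib import Path
from typing import Optional

# Hypothetical root; the real scripts may resolve their output folder differently.
OUTPUT_ROOT = Path("output")


def output_dir(run_label: Optional[str] = None, root: Path = OUTPUT_ROOT) -> Path:
    """Return the folder that results should be written to.

    With no run_label the layout is flat (the root itself). With a label such
    as "2026-5-25", a dated subfolder under the root is returned instead.
    """
    return root / run_label if run_label else root
```

Under this assumption the scripts would drop the hard-coded `DATE_FOLDER` constant, and only the archiving step would pass a label.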

fletchapin · 2026-05-30T00:23:52Z

 from helpers.plotting import make_grouped_legend, save_and_close, set_thick_spines

-DATE_FOLDER = "2026-5-15"
+DATE_FOLDER = "2026-5-25"


Same question about the date

Adding final unit_processes_by_facility.csv file with both datasets
Fixing bug in step4 where "offsite" location wasn't being used
Re-running model comparison with final ontology and Place ID suffix on filenames
Updating step5_llm_extraction with higher token limits for gpt-5 and to save manifest/token csv rows after every facility

dalyw added 8 commits May 16, 2026 14:40

Deleting 4-26 run data, replaced with 5-15

d77cbcb

Merge branch 'main' into remove-4-26

22bd22f

Merge branch 'main' into remove-4-26

c7babd2

Removing 4-26 post-merge

2bfc566

Adding GHG comparison based on El Abbadi

a555b25

New results from 5-25 run

a724d0b

Modified LLM prompt to further encourage structured output and adherence to ontology categories
Expanding San Jose example to include solids, disinfection
Renaming "truth" to "manual reading" throughout after figure_2

Deleting nutrient map and N2O hatched figure

d48607e

Renaming cwns_processes_by_facility to cwns_unit_processes_by_facility for consistency with LLM file
Updating README

dalyw marked this pull request as ready for review May 29, 2026 20:19

dalyw requested a review from fletchapin May 29, 2026 20:19

fletchapin reviewed May 30, 2026

dalyw wants to merge 9 commits into main from remove-4-26
