philipp 09be03e630
fix ci:
2024-02-05 14:38:07 +01:00
.gitea/workflows no db in this project (so far) 2024-01-26 17:24:00 +01:00
data add paragraph parser to lib, add test for teg 2024-02-05 14:28:57 +01:00
src fix ci: 2024-02-05 14:38:07 +01:00
.gitignore empty rust project 2023-11-03 13:45:25 +01:00
.gitlab-ci.yml run live test on ci 2023-11-06 13:36:33 +01:00
Cargo.lock update deps 2024-01-26 16:50:44 +01:00
Cargo.toml add hacks for abgb 2023-11-07 09:51:08 +01:00
download_all_pars_from_list.sh create test for full urhg for overview parsing 2023-11-06 12:00:20 +01:00
README.md better docs 2024-02-05 11:40:28 +01:00

RISolve

Folder

  • ./data
    • cache -> cache for overview tests
    • expected
      • overview -> expected xml links of law_ids

Add new law text

Tests

  • Getting paragraphs from law_id (risparser::overview::test::parse())
    • Create a file named after the law_id in ./data/expected/overview, then run the tests to get the current output and save it in that file.
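
The expected-file workflow above can be sketched as a small comparison helper. This is only an illustration: the function name check_expected, the plain-text comparison, and the trailing-whitespace handling are assumptions, not the actual code in risparser::overview::test.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical helper: compare freshly parsed output with the file
/// `<dir>/<law_id>` (e.g. ./data/expected/overview/<law_id>).
/// Fails if the file is missing or its content differs.
pub fn check_expected(dir: &Path, law_id: &str, actual: &str) -> Result<(), String> {
    let path = dir.join(law_id);
    let expected = fs::read_to_string(&path)
        .map_err(|e| format!("cannot read {}: {}", path.display(), e))?;
    // Ignore trailing newlines so a hand-saved file still matches.
    if expected.trim_end() == actual.trim_end() {
        Ok(())
    } else {
        Err(format!(
            "output for {} differs from {}\n--- actual ---\n{}",
            law_id,
            path.display(),
            actual
        ))
    }
}
```

On a mismatch the error message carries the current output, which can then be reviewed and saved into the expected file by hand.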

Features (to be moved to lib.rs one-by-one)

  • Text to structured law
    • LawBuilder: Structures a law by specifying (sub-)sections (new_header), their descriptions (new_desc), paragraphs under the current (sub-)section (new_par), and the description of the next paragraph (new_next_para_header). A classifier needs to be set.
      • Main output: Properly structured laws (Law)
    • Law: Represents a structured law text. Can be generated with LawBuilder.
      • Main output: properly formatted law text (Markdown for a start); no need to export Heading/... etc
  • RIS Fetcher (to be mocked)
    • all paragraphs of specific law (overview)
    • xml document from url (par/mod.rs fetch_age)
  • Parser
    • replace errors w/ config file
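
As a rough illustration of the LawBuilder flow described above, here is a minimal sketch. Only the method names new_header, new_desc, new_par and new_next_para_header come from this README. Their signatures, the flat Heading/Law layout, the build and to_md helpers, and the omission of the classifier are assumptions made for the example.

```rust
/// Hypothetical, simplified stand-in for one (sub-)section of a law.
#[derive(Debug, Default, PartialEq)]
pub struct Heading {
    pub name: String,
    pub desc: Option<String>,
    /// (optional paragraph header, paragraph text)
    pub pars: Vec<(Option<String>, String)>,
}

#[derive(Debug, Default, PartialEq)]
pub struct Law {
    pub name: String,
    pub headings: Vec<Heading>,
}

pub struct LawBuilder {
    law: Law,
    next_para_header: Option<String>,
}

impl LawBuilder {
    pub fn new(name: &str) -> Self {
        Self {
            law: Law { name: name.to_string(), headings: Vec::new() },
            next_para_header: None,
        }
    }

    /// Start a new (sub-)section; later calls apply to it.
    pub fn new_header(&mut self, name: &str) {
        self.law.headings.push(Heading { name: name.to_string(), ..Default::default() });
    }

    /// Set the description of the current (sub-)section.
    pub fn new_desc(&mut self, desc: &str) -> Result<(), String> {
        let current = self.law.headings.last_mut().ok_or("new_desc called before new_header")?;
        current.desc = Some(desc.to_string());
        Ok(())
    }

    /// Remember a header that belongs to the next paragraph.
    pub fn new_next_para_header(&mut self, header: &str) {
        self.next_para_header = Some(header.to_string());
    }

    /// Add a paragraph to the current (sub-)section, consuming a pending header.
    pub fn new_par(&mut self, text: &str) -> Result<(), String> {
        let current = self.law.headings.last_mut().ok_or("new_par called before new_header")?;
        current.pars.push((self.next_para_header.take(), text.to_string()));
        Ok(())
    }

    pub fn build(self) -> Law {
        self.law
    }
}

impl Law {
    /// Render the law as Markdown (assumed output format "for a start").
    pub fn to_md(&self) -> String {
        let mut out = format!("# {}\n", self.name);
        for h in &self.headings {
            out.push_str(&format!("\n## {}\n", h.name));
            if let Some(d) = &h.desc {
                out.push_str(&format!("\n{}\n", d));
            }
            for (header, text) in &h.pars {
                if let Some(ph) = header {
                    out.push_str(&format!("\n### {}\n", ph));
                }
                out.push_str(&format!("\n{}\n", text));
            }
        }
        out
    }
}
```

In this sketch, calling new_desc or new_par before any new_header returns an error instead of silently dropping the text, in line with the strictness goal stated under Technical.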

Integration test

  • A nice test would be to re-create the RIS HTML file and compare it with the original (custom fixes make this problematic, though)

History

Goals

  • I want to have the text of the law.
  • I want to see the structure (proper headers) of the law.
  • I want to be able to make comments (e.g. on "Erschöpfung", exhaustion) on certain parts.
  • I want to see since when this paragraph is in use.
  • [~] Law text should be updatable

Technical

  • I don't want to restrict myself with parser combinators; instead I code the parser myself as a recursive descent parser.
  • Be strict in what I process. Fail if anything unexpected happens. The user should handle this case. It's fine if one decides to ignore the new/unexpected field, but it should be done deliberately.
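
The "strict, but ignorable on purpose" rule can be sketched as follows. The function select_fields, the FieldError type, and the field names in the usage note are hypothetical and only illustrate the idea: every incoming field is either used, deliberately ignored, or causes a failure.

```rust
#[derive(Debug, PartialEq)]
pub enum FieldError {
    /// A field that is neither used nor explicitly ignored.
    Unexpected(String),
}

/// Return the fields to process. Fields listed in `ignored` are skipped
/// deliberately; anything else that is not in `used` is an error.
pub fn select_fields<'a>(
    fields: &[&'a str],
    used: &[&str],
    ignored: &[&str],
) -> Result<Vec<&'a str>, FieldError> {
    let mut out = Vec::new();
    for &field in fields {
        if used.contains(&field) {
            out.push(field);
        } else if ignored.contains(&field) {
            // The caller opted out of this field on purpose.
            continue;
        } else {
            return Err(FieldError::Unexpected(field.to_string()));
        }
    }
    Ok(out)
}
```

For example, select_fields(&["title", "note"], &["title"], &["note"]) succeeds and yields only "title", while an additional field such as "surprise" makes the call fail until the user adds it to one of the two lists.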

Progress / Functions

  • Parse the structure of a law into a struct using the Deserialize trait, potentially with multiple requests (if there are more than 100 paragraphs)
  • Parse risdok using my own recursive descent (RD) parser, again strict: fail if anything unexpected happens. Not sure (yet) whether I want to operate on strings directly or first parse with an off-the-shelf XML reader (probably the second option)
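
To make the recursive descent idea concrete, here is a minimal sketch of a strict parser for a toy tag grammar (elements without attributes, plus text). It operates directly on strings, which is only one of the two options mentioned above. The grammar, the Node type and all names are assumptions for illustration and do not reflect the real risdok format.

```rust
#[derive(Debug, PartialEq)]
pub enum Node {
    Element { name: String, children: Vec<Node> },
    Text(String),
}

pub struct Parser<'a> {
    rest: &'a str,
}

impl<'a> Parser<'a> {
    pub fn new(input: &'a str) -> Self {
        Self { rest: input }
    }

    /// document := element EOF
    pub fn parse_document(mut self) -> Result<Node, String> {
        let node = self.parse_element()?;
        if !self.rest.is_empty() {
            return Err(format!("unexpected trailing input: {:?}", self.rest));
        }
        Ok(node)
    }

    /// element := '<' name '>' (element | text)* '</' name '>'
    fn parse_element(&mut self) -> Result<Node, String> {
        self.expect("<")?;
        let name = self.take_until('>')?.to_string();
        if name.is_empty() || !name.chars().all(|c| c.is_alphanumeric()) {
            return Err(format!("invalid tag name: {:?}", name));
        }
        self.expect(">")?;
        let mut children = Vec::new();
        loop {
            if self.rest.starts_with("</") {
                self.expect("</")?;
                let close = self.take_until('>')?;
                if close != name {
                    return Err(format!("expected </{}>, found </{}>", name, close));
                }
                self.expect(">")?;
                return Ok(Node::Element { name, children });
            } else if self.rest.starts_with('<') {
                children.push(self.parse_element()?);
            } else {
                // Text runs up to the next tag; a missing tag is an error.
                let text = self.take_until('<')?;
                children.push(Node::Text(text.to_string()));
            }
        }
    }

    /// Consume `token` or fail.
    fn expect(&mut self, token: &str) -> Result<(), String> {
        match self.rest.strip_prefix(token) {
            Some(rest) => {
                self.rest = rest;
                Ok(())
            }
            None => Err(format!("expected {:?} at {:?}", token, self.rest)),
        }
    }

    /// Consume and return everything before `stop`, or fail if `stop` never occurs.
    fn take_until(&mut self, stop: char) -> Result<&'a str, String> {
        let idx = self.rest.find(stop).ok_or_else(|| format!("missing {:?}", stop))?;
        let (head, tail) = self.rest.split_at(idx);
        self.rest = tail;
        Ok(head)
    }
}
```

Each grammar rule maps to one method, and every deviation (mismatched closing tag, trailing input, unclosed element) is reported as an error rather than skipped.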

Next step

  • Parse ABGB
  • Create config files for laws
    • law_id
    • replace stuff
    • headers
  • Create argument parser
    • --law mschg.conf
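
A minimal sketch of the planned argument handling, using only the standard library. The --law flag is taken from the list above. The function name parse_args and the decision to reject unknown or repeated arguments are assumptions, and the config file format itself is left open here.

```rust
use std::path::PathBuf;

/// Parse command-line arguments (without the program name) and return
/// the path given via `--law <file>`. Strict: unknown arguments,
/// a repeated `--law`, or a missing value are errors.
pub fn parse_args<I>(args: I) -> Result<PathBuf, String>
where
    I: IntoIterator<Item = String>,
{
    let mut args = args.into_iter();
    let mut law: Option<PathBuf> = None;
    while let Some(arg) = args.next() {
        match arg.as_str() {
            "--law" => {
                let path = args.next().ok_or("--law requires a file path")?;
                if law.replace(PathBuf::from(path)).is_some() {
                    return Err("--law given more than once".to_string());
                }
            }
            other => return Err(format!("unexpected argument: {}", other)),
        }
    }
    law.ok_or_else(|| "missing required --law <file>".to_string())
}
```

In a binary this could be called as parse_args(std::env::args().skip(1)), with the returned path then handed to whatever reads the config file.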

Naming

  • Law ("Gesetz"): e.g. UHG, TEG, ABGB
  • Section ("Paragraph")
  • Subsection ("Absatz")
  • Item ("Ziffer")
  • Heading-{1,2,3,...}
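
The naming scheme above could map onto types roughly like this. It is a sketch only: the field names, the numeric types, and the nesting of sections under headings are assumptions, not the crate's actual definitions.

```rust
/// Item ("Ziffer") inside a subsection.
#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub number: u32,
    pub text: String,
}

/// Subsection ("Absatz") inside a section.
#[derive(Debug, Clone, PartialEq)]
pub struct Subsection {
    pub number: u32,
    pub text: String,
    pub items: Vec<Item>,
}

/// Section ("Paragraph"), labelled e.g. "§ 1".
#[derive(Debug, Clone, PartialEq)]
pub struct Section {
    pub label: String,
    pub subsections: Vec<Subsection>,
}

/// Either a heading of level 1, 2, 3, ... with nested content, or a section.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Heading { level: u8, title: String, children: Vec<Node> },
    Section(Section),
}

/// Law ("Gesetz"), e.g. UHG, TEG, ABGB.
#[derive(Debug, Clone, PartialEq)]
pub struct Law {
    pub short_name: String,
    pub content: Vec<Node>,
}

impl Law {
    /// Count all sections, however deeply they are nested under headings.
    pub fn section_count(&self) -> usize {
        count(&self.content)
    }
}

fn count(nodes: &[Node]) -> usize {
    nodes
        .iter()
        .map(|n| match n {
            Node::Heading { children, .. } => count(children),
            Node::Section(_) => 1,
        })
        .sum()
}
```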

"Scripts"

  • Retrieve overview law: curl -X POST "https://data.bka.gv.at/ris/api/v2.6/Bundesrecht" -H "Content-Type: application/x-www-form-urlencoded" -d "Applikation=BrKons" -d "Gesetzesnummer=10001899" -d "DokumenteProSeite=OneHundred" -d "Seitennummer=1" -d "Fassung.FassungVom=2023-11-03" | jq . > law.json
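
For use from Rust, the request body of the curl call above can be assembled with the standard library alone. This sketch only builds the form-encoded string. The function names overview_body and pages_needed are hypothetical, sending the request is left to whichever HTTP client the project uses, and the page size of 100 is inferred from DokumenteProSeite=OneHundred.

```rust
/// Encode a string for application/x-www-form-urlencoded.
fn encode(s: &str) -> String {
    let mut out = String::new();
    for b in s.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'*' | b'-' | b'.' | b'_' => {
                out.push(b as char)
            }
            b' ' => out.push('+'),
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Build the POST body for one overview page of a law.
pub fn overview_body(law_id: u32, page: u32, date: &str) -> String {
    let params = [
        ("Applikation", "BrKons".to_string()),
        ("Gesetzesnummer", law_id.to_string()),
        ("DokumenteProSeite", "OneHundred".to_string()),
        ("Seitennummer", page.to_string()),
        ("Fassung.FassungVom", date.to_string()),
    ];
    params
        .iter()
        .map(|(k, v)| format!("{}={}", encode(k), encode(v)))
        .collect::<Vec<_>>()
        .join("&")
}

/// Number of requests needed for `total` paragraphs at 100 per page.
pub fn pages_needed(total: usize) -> usize {
    (total + 99) / 100
}
```

Calling overview_body once per page number from 1 to pages_needed(total) covers laws with more than 100 paragraphs.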