# RISolve
## Folder

- ./data
  - cache -> cache for `overview` tests
  - expected
    - overview -> expected xml links of law_ids
## Add new law text

### Tests

- Getting paragraphs from `law_id` (`risparser::overview::test::parse()`)
  - Create file `law_id` in `./data/expected/overview` (then run tests to get current output + save in file; see the test sketch below)
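A minimal sketch of that expected-file ("golden file") flow. It assumes the cached overview XML lives under `./data/cache/<law_id>` and that the parser yields one link per line; `parse_overview` is only a stand-in for the real function in `risparser::overview`:

```rust
// Sketch of the expected-file ("golden file") test flow described above.
// `parse_overview` is a stand-in for the real parser in risparser::overview;
// the cache path and the output format (one link per line) are assumptions.
use std::fs;

fn parse_overview(xml: &str) -> Vec<String> {
    // Stand-in: the real implementation turns the overview XML into
    // the list of paragraph links for the law.
    let _ = xml;
    unimplemented!()
}

#[test]
fn overview_matches_expected() {
    let law_id = "10001899";
    let xml = fs::read_to_string(format!("data/cache/{law_id}"))
        .expect("cached overview xml");
    // On the first run, save the current output into this file instead.
    let expected = fs::read_to_string(format!("data/expected/overview/{law_id}"))
        .expect("expected file");

    assert_eq!(parse_overview(&xml).join("\n"), expected.trim_end());
}
```

On the first run the expected file does not exist yet; writing the current output into it (and reviewing it by hand) is what "save in file" above refers to.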
## Features (to be moved to `lib.rs` one-by-one)
- Text to structured law
  - `LawBuilder`: Structures a law by specifying (sub-)sections (`new_header`), their description (`new_desc`), paragraphs under the current (sub-)section (`new_par`), and the description of the next paragraph (`new_next_para_header`). A `Classifier` needs to be set. (See the sketch after this list.)
  - Main output: properly structured laws (`Law`)
- `Law`: Represents a structured law text. Can be generated with `LawBuilder`.
  - Main output: properly formatted (md for a start) law text, no need to export Heading/... etc.
- RIS Fetcher (to be mocked)
  - all paragraphs of a specific law (`overview`)
  - xml document from url (`par/mod.rs fetch_age`)
- Parser
  - replace errors w/ config file
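The `LawBuilder` interface above could look roughly like the following. Only the four method names come from the list; the struct layout, the constructor, `build()`, and the omission of the `Classifier` are assumptions made for the sketch:

```rust
// Minimal sketch of the LawBuilder idea; field layout, constructor and build()
// are assumptions, only the method names come from the feature list above.

#[derive(Debug, Default)]
pub struct Law {
    pub name: String,
    pub sections: Vec<Section>,
}

#[derive(Debug, Default)]
pub struct Section {
    pub header: String,        // (sub-)section header, set via new_header
    pub desc: Option<String>,  // description, set via new_desc
    pub paragraphs: Vec<Paragraph>,
}

#[derive(Debug, Default)]
pub struct Paragraph {
    pub header: Option<String>, // set via new_next_para_header
    pub text: String,           // set via new_par
}

#[derive(Debug, Default)]
pub struct LawBuilder {
    law: Law,
    next_para_header: Option<String>,
}

impl LawBuilder {
    pub fn new(name: &str) -> Self {
        Self {
            law: Law { name: name.into(), ..Default::default() },
            next_para_header: None,
        }
    }

    /// Start a new (sub-)section.
    pub fn new_header(&mut self, header: &str) {
        self.law.sections.push(Section { header: header.into(), ..Default::default() });
    }

    /// Description of the current (sub-)section.
    pub fn new_desc(&mut self, desc: &str) {
        if let Some(s) = self.law.sections.last_mut() {
            s.desc = Some(desc.into());
        }
    }

    /// Header that applies only to the next paragraph.
    pub fn new_next_para_header(&mut self, header: &str) {
        self.next_para_header = Some(header.into());
    }

    /// Paragraph under the current (sub-)section.
    pub fn new_par(&mut self, text: &str) {
        if let Some(s) = self.law.sections.last_mut() {
            s.paragraphs.push(Paragraph {
                header: self.next_para_header.take(),
                text: text.into(),
            });
        }
    }

    pub fn build(self) -> Law {
        self.law
    }
}

fn main() {
    let mut b = LawBuilder::new("MSchG");
    b.new_header("1. Abschnitt");
    b.new_desc("Markenrecht");
    b.new_next_para_header("Erschöpfung des Markenrechtes");
    b.new_par("§ 10b. (1) ...");
    println!("{:?}", b.build());
}
```

The `next_para_header` buffer is one way to realize `new_next_para_header`: it is consumed by the next `new_par` call.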
## Integration test

- A nice test would be to re-create the html RIS file and compare it (problem with custom fixes, though)
## History

- I've created my first parser using the RIS API (daily updated). It failed because I tried to do too much automatically (e.g. recognizing headers).
- Using the print website, I've extracted stuff w/ regex.
- Tried to create a proper(-ish) parser based on the print website.
## Goals

- I want to have the text of the law.
- I want to see the structure (proper headers) of the law.
- I want to be able to make comments (e.g. Erschöpfung) on certain parts.
- I want to see since when this paragraph has been in use.
- [~] Law text should be updateable.
## Technical
- I don't want to restrict myself to parser combinators, but rather code it myself as a recursive descent parser.
- Be strict in what I process. Fail if anything unexpected happens. The user should handle this case. It's fine if one decides to ignore the new/unexpected field, but it should be done deliberately.
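As an illustration of that strictness, a parsing step could reject anything it does not explicitly know. The `Node` type and the element names below are placeholders, not the actual RIS XML schema:

```rust
// Sketch of the "be strict" idea: every element the parser does not explicitly
// handle becomes a hard error instead of being skipped silently.
// `Node` and the element names are placeholders, not the real RIS schema.

#[derive(Debug)]
enum ParseError {
    Unexpected { element: String },
}

struct Node {
    name: String,
    text: String,
}

fn parse_paragraph(node: &Node) -> Result<String, ParseError> {
    match node.name.as_str() {
        // Known elements are handled explicitly ...
        "absatz" | "ueberschrift" => Ok(node.text.clone()),
        // ... everything else fails loudly, so new fields in the source
        // have to be handled (or ignored) deliberately.
        other => Err(ParseError::Unexpected { element: other.to_string() }),
    }
}
```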
## Progress / Functions
- Parse structure of law into a struct using the Deserialize trait, potentially with multiple requests (if > 100 paragraphs)
- Parse risdok using my own recursive descent parser, again strict: fail if anything unexpected happens. Not sure (yet) if I want to operate on strings or first parse using an off-the-shelf XML reader (probably the 2nd option).
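For the first point, a strict `Deserialize` setup could look like this. The field names are placeholders rather than the real RIS response schema, and serde/serde_json are assumed dependencies:

```rust
// Sketch only: the field names below are placeholders, not the real RIS schema.
use serde::Deserialize;

// Fail on any field that is not modeled, matching the "be strict" rule above.
#[derive(Debug, Deserialize)]
#[serde(deny_unknown_fields)]
struct Overview {
    hits: u32,                   // placeholder: total number of documents
    documents: Vec<DocumentRef>, // placeholder: one entry per paragraph
}

#[derive(Debug, Deserialize)]
#[serde(deny_unknown_fields)]
struct DocumentRef {
    law_id: String, // placeholder
    url: String,    // placeholder: link to the paragraph's XML
}

fn parse_overview_page(json: &str) -> Result<Overview, serde_json::Error> {
    serde_json::from_str(json)
    // If hits > 100, the caller would repeat the request with an
    // incremented Seitennummer and merge the `documents` vectors.
}
```

With `deny_unknown_fields`, a new field in the response fails loudly instead of being silently dropped.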
## Next step

- Parse ABGB
- Create config files for laws (a possible shape is sketched below)
  - `law_id`
  - replace stuff
  - headers
- Create an argument parser (`--law mschg.conf`)
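A sketch of what the `--law mschg.conf` handling and such a config file could look like. The line-based `key = value` format, the field names, and the hand-rolled argument loop are all assumptions for illustration:

```rust
// Hypothetical config handling for `--law mschg.conf`; the field names and the
// simple "key = value"-per-line format are assumptions, not the real format.
use std::{collections::HashMap, env, fs};

#[derive(Debug, Default)]
struct LawConfig {
    law_id: String,                   // from a "law_id = ..." line
    replace: HashMap<String, String>, // "replace stuff": literal text fixes
    headers: Vec<String>,             // header lines that start a new section
}

fn main() {
    // Argument parsing: `risolve --law mschg.conf`
    let mut args = env::args().skip(1);
    let mut config_path = None;
    while let Some(arg) = args.next() {
        if arg == "--law" {
            config_path = args.next();
        }
    }
    let path = config_path.expect("usage: risolve --law <config>");
    let text = fs::read_to_string(&path).expect("readable config file");
    println!("{:?}", parse_config(&text));
}

fn parse_config(text: &str) -> LawConfig {
    let mut cfg = LawConfig::default();
    for line in text.lines().filter(|l| !l.trim().is_empty()) {
        match line.split_once('=').map(|(k, v)| (k.trim(), v.trim())) {
            Some(("law_id", v)) => cfg.law_id = v.to_string(),
            Some(("header", v)) => cfg.headers.push(v.to_string()),
            Some(("replace", v)) => {
                // e.g. `replace = old -> new`
                if let Some((from, to)) = v.split_once("->") {
                    cfg.replace.insert(from.trim().to_string(), to.trim().to_string());
                }
            }
            other => panic!("unexpected config line: {other:?}"), // strict, as above
        }
    }
    cfg
}
```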
## Naming
- Law ("Gesetz"): e.g. UHG, TEG, ABGB
- Section ("Paragraph")
- Subsection ("Absatz")
- Item ("Ziffer")
- Heading-{1,2,3,...}
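The naming above could map onto types roughly like this (a sketch, not the actual definitions in `src`):

```rust
// Sketch only: the naming scheme as an enum over structural elements of a
// Law ("Gesetz"); the real types in src/ may look different.
enum Element {
    /// Heading-{1,2,3,...}: nesting level plus text
    Heading { level: u8, text: String },
    /// Section ("Paragraph"), e.g. "§ 10b"
    Section { number: String },
    /// Subsection ("Absatz"), e.g. "(1)"
    Subsection { number: u32, text: String },
    /// Item ("Ziffer"), e.g. "1."
    Item { number: u32, text: String },
}
```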
"Scripts"
- Retrieve overview law:
curl -X POST "https://data.bka.gv.at/ris/api/v2.6/Bundesrecht" -H "Content-Type: application/x-www-form-urlencoded" -d "Applikation=BrKons" -d "Gesetzesnummer=10001899" -d "DokumenteProSeite=OneHundred" -d "Seitennummer=1" -d "Fassung.FassungVom=2023-11-03" | jq . > law.json
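For the planned (mockable) RIS fetcher, the same request could be issued from Rust roughly like this; `reqwest` with the `blocking` feature is an assumed dependency here, not necessarily what the crate uses:

```rust
// Sketch: the curl call above as a blocking Rust request.
// Assumes reqwest = { version = "...", features = ["blocking"] } in Cargo.toml.
fn fetch_overview() -> Result<String, reqwest::Error> {
    let params = [
        ("Applikation", "BrKons"),
        ("Gesetzesnummer", "10001899"),
        ("DokumenteProSeite", "OneHundred"),
        ("Seitennummer", "1"),
        ("Fassung.FassungVom", "2023-11-03"),
    ];
    let client = reqwest::blocking::Client::new();
    client
        .post("https://data.bka.gv.at/ris/api/v2.6/Bundesrecht")
        .form(&params)
        .send()?
        .text()
}
```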