
RISolve

# Folder
- ./data
	- cache -> cache for `overview` tests
	- expected
		- overview -> expected xml links of law_ids

# Add new law text
## Tests
- Getting paragraphs from `law_id` (`risparser::overview::test::parse()`)
	- Create file `law_id` in `./data/expected/overview` (then run tests to get current output + save in file)
- Parsing paragraphs: add test in `src/risparser/paragraph/mod.rs`


# Features (to be moved to lib.rs one-by-one)
- Text to structured law
	- `LawBuilder`: Structures a law by specifying (sub-)sections (`new_header`), their descriptions (`new_desc`), paragraphs under the current (sub-)section (`new_par`), and the description of the next paragraph (`new_next_para_header`). A `Classifier` needs to be set.
		- Main output: Properly structured laws (`Law`)
	- `Law`: Represents a structured law text. Can be generated with `LawBuilder`.
		- Main output: properly formatted (md for a start) law text, no need to export Heading/... etc
- RIS Fetcher (to be mocked)
	- all paragraphs of specific law (`overview`)
	- xml document from url (`par/mod.rs fetch_age`)
- Parser
	- replace errors w/ config file
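
The builder flow described above could look roughly like the following. This is a minimal sketch, not the actual implementation: only the names `LawBuilder`, `Law`, `Classifier`, `new_header`, `new_desc`, `new_par` and `new_next_para_header` come from this README. All signatures, the `Entry` enum, `to_md`, `build` and `example_classifier` are assumptions made for illustration.

```rust
/// Assumed shape: a classifier maps a header name to its nesting depth (1 = top level).
type Classifier = fn(&str) -> usize;

#[derive(Debug, PartialEq)]
enum Entry {
    Header { depth: usize, name: String, desc: Option<String> },
    Paragraph { header: Option<String>, text: String },
}

/// A structured law text; the main output is markdown.
#[derive(Debug, PartialEq)]
struct Law {
    name: String,
    entries: Vec<Entry>,
}

impl Law {
    fn to_md(&self) -> String {
        let mut out = format!("# {}\n", self.name);
        for entry in &self.entries {
            match entry {
                Entry::Header { depth, name, desc } => {
                    // The law name occupies `#`, so headers start at `##`.
                    out.push_str(&format!("\n{} {}\n", "#".repeat(depth + 1), name));
                    if let Some(desc) = desc {
                        out.push_str(&format!("\n*{}*\n", desc));
                    }
                }
                Entry::Paragraph { header, text } => {
                    if let Some(header) = header {
                        out.push_str(&format!("\n**{}**\n", header));
                    }
                    out.push_str(&format!("\n{}\n", text));
                }
            }
        }
        out
    }
}

struct LawBuilder {
    law: Law,
    classifier: Classifier,
    next_para_header: Option<String>,
}

impl LawBuilder {
    fn new(name: &str, classifier: Classifier) -> Self {
        let law = Law { name: name.to_string(), entries: Vec::new() };
        Self { law, classifier, next_para_header: None }
    }

    /// Opens a new (sub-)section; its depth comes from the classifier.
    fn new_header(&mut self, name: &str) {
        let depth = (self.classifier)(name);
        self.law.entries.push(Entry::Header { depth, name: name.to_string(), desc: None });
    }

    /// Describes the header added last; fails if the last entry is not a header.
    fn new_desc(&mut self, text: &str) -> Result<(), String> {
        match self.law.entries.last_mut() {
            Some(Entry::Header { desc, .. }) => {
                *desc = Some(text.to_string());
                Ok(())
            }
            _ => Err(format!("description `{text}` does not follow a header")),
        }
    }

    /// Remembers a header that is attached to the next paragraph.
    fn new_next_para_header(&mut self, header: &str) {
        self.next_para_header = Some(header.to_string());
    }

    /// Adds a paragraph under the current (sub-)section.
    fn new_par(&mut self, text: &str) {
        let header = self.next_para_header.take();
        self.law.entries.push(Entry::Paragraph { header, text: text.to_string() });
    }

    /// Fails if a paragraph header was announced but no paragraph followed.
    fn build(self) -> Result<Law, String> {
        match self.next_para_header {
            Some(header) => Err(format!("paragraph header `{header}` has no paragraph")),
            None => Ok(self.law),
        }
    }
}

/// Toy classifier for the example: names ending in "Part" are top level.
fn example_classifier(name: &str) -> usize {
    if name.ends_with("Part") { 1 } else { 2 }
}
```

In this sketch the builder is strict in the same spirit as the rest of the project: a description without a header, or an announced paragraph header without a paragraph, is an error rather than something silently dropped.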

# Integration test
- A nice test would be to re-create the RIS HTML file and compare it with the original (custom fixes are a problem, though)

# History
- [I created my first parser using the RIS API, updated daily. It failed because I tried to do too much automatically (e.g. recognizing headers).](https://gitlab.com/PhilippHofer/law)
- [Using the print website, I extracted content with regex.](https://gitlab.com/PhilippHofer/ris/)
- [Tried to create a proper(-ish) parser using the print website.](https://gitlab.com/PhilippHofer/ris2)

# Goals

- [x] I want to have the text of the law.
- [x] I want to see the structure (proper headers) of the law.
- [ ] I want to be able to make comments (e.g. Erschöpfung) on certain parts
- [ ] I want to see since when a paragraph has been in force.
- [~] The law text should be updatable

# Technical

- I don't want to restrict myself to a [parser combinator](https://docs.rs/nom) library; I'd rather write a *recursive descent* parser myself.
- Be strict in what I process. Fail if anything unexpected happens. The user should handle this case. It's fine if one decides to ignore the new/unexpected field, but it should be done deliberately.
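
To illustrate both points, here is a minimal sketch of a hand-written recursive descent parser that fails on anything unexpected. It is not the project's parser: the tiny grammar (elements without attributes), the tag whitelist `KNOWN_TAGS`, and all type and function names are assumptions chosen for the example.

```rust
#[derive(Debug, PartialEq)]
enum Node {
    Element { name: String, children: Vec<Node> },
    Text(String),
}

#[derive(Debug, PartialEq)]
enum ParseError {
    /// The input ended or continued with something other than this token.
    Expected(&'static str),
    UnknownTag(String),
    MismatchedClose { expected: String, found: String },
    TrailingInput(String),
}

/// Hypothetical whitelist: every other tag name is an error.
const KNOWN_TAGS: [&str; 2] = ["absatz", "ueberschrift"];

struct Parser<'a> {
    rest: &'a str,
}

impl<'a> Parser<'a> {
    /// element := '<' name '>' (element | text)* '</' name '>'
    fn element(&mut self) -> Result<Node, ParseError> {
        let name = self.tag("<")?;
        if !KNOWN_TAGS.contains(&name.as_str()) {
            return Err(ParseError::UnknownTag(name));
        }
        let mut children = Vec::new();
        loop {
            if self.rest.starts_with("</") {
                let found = self.tag("</")?;
                if found != name {
                    return Err(ParseError::MismatchedClose { expected: name, found });
                }
                return Ok(Node::Element { name, children });
            } else if self.rest.starts_with('<') {
                // Recursive descent: a nested element is parsed by the same rule.
                children.push(self.element()?);
            } else if self.rest.is_empty() {
                return Err(ParseError::Expected("</"));
            } else {
                children.push(self.text());
            }
        }
    }

    /// Consumes `prefix name '>'` and returns the name.
    fn tag(&mut self, prefix: &'static str) -> Result<String, ParseError> {
        let body = self.rest.strip_prefix(prefix).ok_or(ParseError::Expected(prefix))?;
        let end = body.find('>').ok_or(ParseError::Expected(">"))?;
        let name = body[..end].to_string();
        self.rest = &body[end + 1..];
        Ok(name)
    }

    /// Consumes everything up to the next '<' (or the end of the input).
    fn text(&mut self) -> Node {
        let end = self.rest.find('<').unwrap_or(self.rest.len());
        let (text, rest) = self.rest.split_at(end);
        self.rest = rest;
        Node::Text(text.to_string())
    }
}

/// Parses exactly one element; leftover input is an error, not ignored.
fn parse(input: &str) -> Result<Node, ParseError> {
    let mut parser = Parser { rest: input };
    let node = parser.element()?;
    if parser.rest.is_empty() {
        Ok(node)
    } else {
        Err(ParseError::TrailingInput(parser.rest.to_string()))
    }
}
```

With this shape, `parse("<absatz><foo>x</foo></absatz>")` returns `Err(ParseError::UnknownTag("foo".to_string()))`. A caller who wants to tolerate `foo` has to handle that error explicitly, which is the deliberate decision the second point asks for.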

# Progress / Functions

- [x] Parse the structure of a law into a struct using the `Deserialize` trait, potentially with multiple requests (if > 100 paragraphs)
- [x] Parse risdok using my own *RD parser* (recursive descent), again strict: fail if anything unexpected happens. Not sure (yet) whether to operate on strings or to first parse with an off-the-shelf XML reader (probably the second option)
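
The "multiple requests" case can be sketched as a loop over 1-based page numbers with a page size of 100. This is only an illustration under assumptions: the `Page` struct, its `total_hits` field and the `fetch_all` function are hypothetical, and the closure stands in for the real HTTP request, so it can be mocked in tests.

```rust
/// Number of documents requested per page (matches `DokumenteProSeite=OneHundred`).
const PAGE_SIZE: usize = 100;

/// Hypothetical, already deserialized response page.
struct Page<T> {
    /// Total number of documents across all pages (assumed to be reported per response).
    total_hits: usize,
    items: Vec<T>,
}

/// Requests page 1, 2, ... until all `total_hits` items are collected.
/// `fetch_page` receives the 1-based page number.
fn fetch_all<T, E>(
    mut fetch_page: impl FnMut(usize) -> Result<Page<T>, E>,
) -> Result<Vec<T>, E> {
    let mut items = Vec::new();
    let mut page_number = 1;
    loop {
        let Page { total_hits, items: batch } = fetch_page(page_number)?;
        let received = batch.len();
        items.extend(batch);
        // Stop when everything is collected; an empty page also ends the
        // loop so that an inconsistent `total_hits` cannot cause endless requests.
        if received == 0 || items.len() >= total_hits {
            return Ok(items);
        }
        page_number += 1;
    }
}
```

A law with 250 paragraphs would result in three calls to `fetch_page` (100 + 100 + 50 items) in this sketch.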
  
# Next step

- [x] Parse ABGB
- [ ] Create config files for laws
	- law_id
	- replace stuff
	- headers
- [ ] Create argument parser
	- `--law mschg.conf`
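
A possible starting point for the argument parser, using only the standard library. This is a sketch under assumptions: only the `--law <file>` flag comes from the list above, while the `Args` struct, `parse_args` and the error messages are made up for illustration. Unknown arguments are rejected, in line with the strictness goal.

```rust
use std::path::PathBuf;

#[derive(Debug, PartialEq)]
struct Args {
    /// Path to the config file of the law, e.g. `mschg.conf`.
    law: PathBuf,
}

/// Parses the arguments that follow the program name.
fn parse_args<I>(args: I) -> Result<Args, String>
where
    I: IntoIterator<Item = String>,
{
    let mut args = args.into_iter();
    let mut law: Option<PathBuf> = None;
    while let Some(arg) = args.next() {
        match arg.as_str() {
            "--law" => {
                let path = args
                    .next()
                    .ok_or("`--law` expects a path to a config file")?;
                // `replace` returns the previous value, so `Some` means a duplicate flag.
                if law.replace(PathBuf::from(path)).is_some() {
                    return Err("`--law` was given more than once".to_string());
                }
            }
            // Strict: anything unexpected is an error instead of being ignored.
            other => return Err(format!("unexpected argument `{other}`")),
        }
    }
    law.map(|law| Args { law })
        .ok_or_else(|| "missing `--law <config>`".to_string())
}
```

In a binary this would be called as `parse_args(std::env::args().skip(1))`, skipping the program name.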

# Naming

- Law ("Gesetz"): e.g. UHG, TEG, ABGB
- Section ("Paragraph")
- Subsection ("Absatz")
- Item ("Ziffer")
- Heading-{1,2,3,...} 
  
  
# "Scripts"
- Retrieve overview law: `curl -X POST "https://data.bka.gv.at/ris/api/v2.6/Bundesrecht" -H "Content-Type: application/x-www-form-urlencoded" -d "Applikation=BrKons" -d "Gesetzesnummer=10001899" -d "DokumenteProSeite=OneHundred" -d "Seitennummer=1" -d "Fassung.FassungVom=2023-11-03" | jq . > law.json`