Masthash

#parser

Damon L. Wakes
3 days ago

Just got my #IFComp2023 entry finalised, including a walkthrough and everything! Keep an eye out for when the games go live in a couple of days: if you're looking for something that's set in a colourful candyland but is still chock full of violence, sex and cannibalism, then boy do I have exactly that lined up for you! #InteractiveFiction #IFComp #parser

The cover art for Who Iced Mayor McFreeze?, showing a sort of blue ice slurry emerging from a grating machine. The shirt has a conspicuous eyeball embedded in it.

I found it! In the wild! Actual production data! A #CSV field with a line break!

I thought it was the sort of thing that only exists in the specification, but no, there it is, I actually encountered one in real-world data!

And thus, I am proven correct once more: Don't use split, don't use regex, don't iterate over lines; use a proper CSV #parser!

And for those of you who are a bit confused right now: Yes, quoted CSV fields can totally contain newline characters.
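A minimal sketch of the difference, using Python's standard `csv` module. The sample data is made up for illustration; the point is only that a quoted field may legally span lines, so line-based splitting miscounts the records.

```python
import csv
import io

# Hypothetical sample: the second field of the first record contains a line break.
raw = 'id,comment\r\n1,"first line\nsecond line"\r\n2,plain\r\n'

# Naive approach: splitting on line breaks cuts the quoted field in half.
naive_rows = [line.split(",") for line in raw.splitlines()]

# csv.reader tracks quoting state, so the embedded newline stays inside the field.
# newline="" passes line endings through untranslated, as the csv docs recommend.
parsed_rows = list(csv.reader(io.StringIO(raw, newline="")))

print(len(naive_rows))          # 4 "rows", one record is split in two
print(len(parsed_rows))         # 3 rows: header plus two records
print(repr(parsed_rows[1][1]))  # 'first line\nsecond line'
```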

Lewis
5 days ago
yoxem
6 days ago

I've figured out a way to eliminate #左遞迴 (left recursion): https://git.kianting.info/?p=clo;a=commitdiff;h=18cb2328ffb2364203f3aefedaf76f1686bb2eb3;ds=sidebyside I'll try it later.

+    FuncApp ::= Single FuncAppAux | Single
+    FuncAppAux ::= FuncCallee FuncAppAux
+    FuncCallee ::= "(" ")" | "(" ARGS ")"
+    ARGS ::= Single "," ARGS | Single

#Clo #Parser
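For readers unfamiliar with the technique, here is a rough sketch of how a grammar of this shape can be parsed by recursive descent once the left recursion is gone. It is not taken from the Clo repository. It assumes that `Single` is a plain identifier and that the auxiliary rule stops after one or more callees (the rules above do not spell out a base case). All function names are hypothetical.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[(),])")
PUNCT = {"(", ")", ","}

def tokenize(src):
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character at {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_single(tokens, pos):
    # Assumption: Single is just an identifier.
    if pos < len(tokens) and tokens[pos] not in PUNCT:
        return ("var", tokens[pos]), pos + 1
    raise SyntaxError("expected an identifier")

def parse_callee(tokens, pos):
    # FuncCallee ::= "(" ")" | "(" ARGS ")"
    pos += 1  # skip "("
    args = []
    if pos < len(tokens) and tokens[pos] == ")":
        return args, pos + 1
    while True:
        arg, pos = parse_single(tokens, pos)
        args.append(arg)
        if pos < len(tokens) and tokens[pos] == ",":
            pos += 1
        elif pos < len(tokens) and tokens[pos] == ")":
            return args, pos + 1
        else:
            raise SyntaxError("expected ',' or ')'")

def parse_func_app(tokens):
    # FuncApp ::= Single FuncAppAux | Single
    node, pos = parse_single(tokens, 0)
    # The auxiliary rule becomes a loop: each "(...)" wraps the tree built so
    # far, so calls stay left-associative without any left-recursive rule.
    while pos < len(tokens) and tokens[pos] == "(":
        args, pos = parse_callee(tokens, pos)
        node = ("call", node, args)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return node

print(parse_func_app(tokenize("f(a, b)(c)")))
```

The call `f(a, b)(c)` comes out as a call whose callee is itself the call `f(a, b)`, which is the shape a left-recursive rule would have produced.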

꧁~Naive-Cea~꧂
1 week ago

Looks like #earthquakes affect the #internet too!
:blobcatgiggle2:
#NonSenseAlcea #php #parser

Brbbrrrr

cd /storage/emulated/0/Download 
ffmpeg -i *.mp4 -vf "scale=2000:1200:force_original_aspect_ratio=decrease,pad=2000:1200:(ow-iw)/2:(oh-ih)/2" -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 128k output_video.mp4
꧁~Naive-Cea~꧂
1 week ago

@don

I wonder whether one could write a #parser that checks fediverse.observer against the criteria in real time (#echtzeit)

:blobcatthinkowo:

Damon L. Wakes
2 weeks ago

A while back I made a post asking for help #playtesting my #IFComp entry - a #parser game made with #Inform. Quite a few people very kindly responded, but I'm ashamed to say I've since lost track of the post so I no longer know who they were.

Anyway. This is just to say if anybody wants to help with playtesting, please reply and let me know! The game is "Who Iced Mayor McFreeze?": essentially Who Framed Roger Rabbit meets Candy Crush. #gamedev #InteractiveFiction

You #know I'm bored, when I pimp my terribad #dekudeals #parser yet again.

This time:
#Chibi #images

#CodeAlcea

OLD
NEW !!
playest
2 weeks ago

Hey #rust people, I'm trying to implement a #parser with #nom able to parse basic expressions like "a = 2 + 3 - 1" and "b = 2 * (3 + 5)".

I tried but I'm kind of stuck. Do you have any tutorial or example project that could help me?
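Not a nom answer, but the general shape of such a parser may help. The sketch below is plain Python, not Rust, and uses the classic layered grammar (assignment, expr, term, factor) that a combinator library would also encode. It evaluates as it parses; the class and function names are made up for this example.

```python
import re

TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")

def tokenize(src):
    tokens = []
    for num, name, op in TOKEN.findall(src.strip()):
        if num:
            tokens.append(("num", int(num)))
        elif name:
            tokens.append(("name", name))
        else:
            tokens.append(("op", op))
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("eof", None)

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expect_op(self, op):
        if self.take() != ("op", op):
            raise SyntaxError(f"expected {op!r}")

    def assignment(self):  # assignment := name "=" expr
        kind, name = self.take()
        if kind != "name":
            raise SyntaxError("expected a variable name")
        self.expect_op("=")
        value = self.expr()
        if self.peek()[0] != "eof":
            raise SyntaxError("trailing input")
        return name, value

    def expr(self):  # expr := term (("+" | "-") term)*
        value = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            _, op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):  # term := factor ("*" factor)*
        value = self.factor()
        while self.peek() == ("op", "*"):
            self.take()
            value *= self.factor()
        return value

    def factor(self):  # factor := number | "(" expr ")"
        kind, val = self.take()
        if kind == "num":
            return val
        if (kind, val) == ("op", "("):
            value = self.expr()
            self.expect_op(")")
            return value
        raise SyntaxError("expected a number or '('")

def run(line):
    return Parser(tokenize(line)).assignment()

print(run("a = 2 + 3 - 1"))    # ('a', 4)
print(run("b = 2 * (3 + 5)"))  # ('b', 16)
```

Each grammar level is one method, and a higher-precedence level is always called from a lower one, which is what gives `*` priority over `+` and `-`.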

Karsten Schmidt
1 month ago

#HowToThing #013 — Building a toy Lisp language and interpreter using the S-expression parser from https://thi.ng/sexpr and polymorphic multiple dispatch functions via https://thi.ng/defmulti. A small language like this can be useful for DSL purposes, user programming or for just learning about interpreters. The entire setup is highly customizable (incl. support for different kinds of S-expressions, see package readme).

Even this tiny example includes the following features: variadic math ops, ability to define new symbols/variables & functions, lexical scoping, numeric & string values...

Some example invocations are included at the end...

Source code:
https://github.com/thi-ng/umbrella/blob/develop/packages/sexpr/README.md#interpreter

(Update: Minor code simplifications, updated images)

#ThingUmbrella #TypeScript #JavaScript #Parser #DSL #Interpreter #Lisp #Tutorial

Screenshot of 1st part of the linked TypeScript source code...
Screenshot of 2nd part of the linked TypeScript source code...

In hopes of making a better experience for itch.io players, I've tuned a release of Repeat the Ending for comfortable play/reading. I've also added instructions for "reading" RTE for those who would prefer not to experience it as a parser game. I hope this is helpful!

https://kamin3ko.itch.io/repeat-the-ending/devlog/574431/release-35-for-repeat-the-ending-reader-mode-and-more

#interactivefiction #inform7 #parser

2 months ago

I wrote a #parser in #go this week for a custom file format that has a similar, yet distinct enough grammar from C-like languages that nothing worked out of the box. I was on a roll finding a library that gave me a customisable #lexer, then completely hit a wall on the parser. Turns out just writing it yourself in a bunch of functions over a number of files is way faster and more flexible than any library out there!

C.
2 months ago

@folkerschamel

Oops, so I am :)

It does look buggy to me, but I am really not the one to ask about the nitty-gritty details of the #parser.

#cheat

lorddimwit
2 months ago

Eating lunch and I suddenly thought about a project I did about fifteen years ago.

It was a web UI toolkit, based on the idea of #declarative templates rerendered every time the model changed, and the model was a JSON doc that was synchronized between browser and server.

There was a #parser combinator library, expression language with a compiler and VM, and templating library.

The code is long gone but it looks like the paper is still available.

https://www.usenix.org/legacy/event/webapps10/tech/full_papers/King.pdf

2 months ago

Examples of binary choices where both options seem equally good at first, but really aren't:

When writing parsers using #parser #combinators, consuming trailing whitespace is better than consuming leading whitespace. "Design patterns for parser combinators" https://dl.acm.org/doi/abs/10.1145/3471874.3472984

When writing #E2E #tests, clearing the database before each test is better than clearing it after. "Dangling state is your friend" https://docs.cypress.io/guides/references/best-practices#Dangling-state-is-your-friend
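To make the parser-combinator point concrete, here is a small sketch of the "consume trailing whitespace" convention in plain Python. The combinator names (`regex`, `lexeme`, `seq`, `whole_input`) are invented for this example and do not come from the linked paper or any particular library.

```python
import re

def regex(pattern):
    """Parser that matches `pattern` exactly at the current position."""
    compiled = re.compile(pattern)
    def parse(text, pos):
        m = compiled.match(text, pos)
        return (m.group(0), m.end()) if m else None
    return parse

_spaces = regex(r"\s*")  # always succeeds, possibly matching nothing

def lexeme(parser):
    """Run `parser`, then skip the whitespace that follows the token."""
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, pos = result
        _, pos = _spaces(text, pos)
        return value, pos
    return parse

def seq(*parsers):
    """Run parsers one after another and collect their values."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def whole_input(parser):
    """Skip leading whitespace once, at the very start, then require end of input."""
    def parse(text):
        _, pos = _spaces(text, 0)
        result = parser(text, pos)
        if result is None or result[1] != len(text):
            return None
        return result[0]
    return parse

number = lexeme(regex(r"\d+"))
plus = lexeme(regex(r"\+"))
addition = whole_input(seq(number, plus, number))

print(addition("  1 +  2  "))  # ['1', '+', '2']
print(addition("1 + x"))       # None
```

With this convention every token parser starts exactly on the first character of its token, so a failing alternative fails immediately without having consumed any whitespace first. That is one commonly cited reason for preferring the trailing variant.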

@thorcik my #parser read: "różańcówki"

Jan :rust: :ferris:
2 months ago

winnow - Making parsing a breeze (by epage and contributors):

https://github.com/winnow-rs/winnow

winnow is a parser combinator library written in Rust that started as a fork of #nom.

Not sure yet whether I should migrate from nom 4.* to #winnow or #chumsky. I really do like the good error recovery and parsing capabilities for PEGs of chumsky.🥰

#Rust #RustLang #Parser #CrateTip

As good as they were, AMFV and Spellbreaker were Infocom's worst-selling games to date. Infocom's financial situation was dire, as it appeared that game sales were not enough to undo the financial damage that Cornerstone had done to the company. Such are the conditions of Ballyhoo's production, 1986's first Infocom release, a strange mix of brilliance and missteps. Let's dive in.

https://golmac.org/ballyhoo-and-the-rest-of-it/

#infocom #infocomodon #interactivefiction #parser #standarddifficulty

Mark
3 months ago

@rf @tech In case anyone doesn't know, I decided to build my own little #parser.

Specifically, the idea is that you can export your entire network into md-format files down to a given depth. (For me, depth 3 did not export completely...)

And then analyze the #SocialGraph with third-party tools.
In particular, I plan to use #Obsidian.

I fixed the bug with profile links. There was also an annoying error in the recursive function...

If I run my own instance, will there be no #API limits?

The screenshot shows the Obsidian interface.
On the left, a Markdown file is open containing information about the user MarkVobl. On the right is the social graph of the Mastodon network at depth 3 from MarkVobl.
Ivan Enderlin 🦀
3 months ago

Parsing time stamps faster with SIMD instructions, https://lemire.me/blog/2023/07/01/parsing-time-stamps-faster-with-simd-instructions/.

A short article, almost a note, explaining how to parse time stamps with SIMD: 10x fewer instructions and 5.8x faster than strptime.

#performance #parser #simd

IT News
3 months ago

Pratt Parsing for Algebraic Expressions - Parsing algebraic expressions is always a pain. If you need to compute, say, 2+4*2... - https://hackaday.com/2023/07/03/pratt-parsing-for-algebraic-expressions/ #algebraiccomputation #softwaredevelopment #prattparser #parser
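A compact sketch of the idea, for the `2+4*2` case mentioned above. This is a generic Pratt-style evaluator written for illustration, not the code from the linked article; the binding-power numbers and function names are arbitrary choices.

```python
import operator
import re

# Higher binding power means the operator grabs its operands more tightly.
BINDING_POWER = {"+": 10, "-": 10, "*": 20}
APPLY = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def tokenize(src):
    return re.findall(r"\d+|[-+*()]", src)

def parse(tokens, min_bp=0):
    """Parse and evaluate from the front of `tokens`, stopping at any
    operator whose binding power is not greater than `min_bp`."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        left = parse(tokens, 0)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
    elif tok.isdigit():
        left = int(tok)
    else:
        raise SyntaxError(f"unexpected token {tok!r}")

    while tokens and tokens[0] in BINDING_POWER and BINDING_POWER[tokens[0]] > min_bp:
        op = tokens.pop(0)
        # Passing the operator's own power makes equal-precedence operators
        # left-associative: the recursive call stops before the next "+" or "-".
        right = parse(tokens, BINDING_POWER[op])
        left = APPLY[op](left, right)
    return left

def evaluate(src):
    tokens = tokenize(src)
    result = parse(tokens)
    if tokens:
        raise SyntaxError("trailing input")
    return result

print(evaluate("2+4*2"))    # 10
print(evaluate("(2+4)*2"))  # 12
print(evaluate("8-3-2"))    # 3
```

Compared with a layered recursive-descent grammar, precedence lives in one table instead of one function per level, so adding an operator is a one-line change.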

Jan :rust: :ferris:
3 months ago

Forest: Structural Code Editing with Multiple Cursors:

https://arxiv.org/abs/2210.11124

"In this work, we present Forest, a structural code editor that aims to bridge the gap between the interactiveness of code editors and the expressiveness of refactoring scripts."

Better than every #AI out there, if you ask me.

Also, #Forest is a very suitable name for this. 🌲

#CodeEditor #DevTools #Refactoring #StructuralEditing #AST #Parser

Vivienne Dunstan
4 months ago

Taking a slight detour in my play through old #interactiveFiction games. Still have #KnightOrc lined up to play properly soon. But tonight it's time to go back to the 2001 #IFComp winning game All Roads. Set in magical #Venice, which I have to admit is a draw for me. I do remember enjoying this game a lot back in 2001. #parser #textAdventure #IndieGames

Screen shot of All Roads playing in the Lectrote interpreter on a Mac. The text on screen says "Scaffold in the City Square. Your head is pulled back, held by the rope pressing on your throat. Your toes pivot on a rickety stool, which shakes as your legs shake. The crowd filling the square are chattering like monkeys, but they are just flits of colour in the corners of your eyes. You cannot look round - instead you gaze straight across the Square, to the great dial of the Clock."

"Repeat the Ending" is the "critical edition" of a forgotten 1996 Inform 5 game about loss, mental illness, and the second law of thermodynamics. It won a "best in show" ribbon in the 2023 Spring Thing festival.

A post-festival release of the game, along with several digital feelies, is now available. Everything is linked at IFDB, which also has a "play online" feature.

Thanks for your interest!

https://ifdb.org/viewgame?id=eueqjtej7bvnfp5a

#InteractiveFiction #Parser #Inform7 #repeattheending

lorddimwit
5 months ago

Last night I got on a tear and wrote a complete #tokenizer for the Manatee programming language in C. I started at…9ish and finished at 1 in the morning

(It is, AFAIK, completely compliant except that I didn’t bother with Unicode. I suppose I could relatively easily augment it to use wchars…which aren’t *necessarily* Unicode but if we stick to standard C we gotta make sacrifices. __STDC_ISO_10646__ FTW I suppose.)

I suppose I should probably write a #parser over the weekend, time permitting

I asked for help in the #Rust community #discord server about how I could implement a #hsml #parser in the best way and if there is something I should take into account while developing

The two helpers `@Insomnia` and `@Sir Toastie` were really helpful 🧡
They explained some stuff and told me about the #nom #crate

https://docs.rs/nom

This is just perfect for what I need 🙌

I will open the #github repo when I feel more confident about making it public 😉

Vivienne Dunstan
5 months ago

I have several #interactiveFiction #parser #textAdventure #games works in progress, that I'm in various stages of developing. Suddenly had great idea last night for one of them, reframing the opening and overall structure. Making frantic notes now before I totally forget it all. Typing into comments in my #Inform7 source code. Feeling invigorated. Hoping that my #neuro #illness allows me enough good time to complete this #game and the others satisfactorily. #IndieGames #GameDev #creativeWriting

Vivienne Dunstan
5 months ago

Browsing the IFDB #InteractiveFiction database and just stumbled again across my own list of my top 10 personal favourite interactive fiction #games. The list is arranged in chronological order, from 1983 to 2017. Some predictable titles in there but also some less familiar ones. I include comments about each one. https://ifdb.org/viewlist?id=fplmp7feqc2spqwj #TextAdventure #Parser #CreativeWriting #IndieGames #GameDev #Top10 #Infocom #DouglasAdams #Tolkien #Fantasy #CallOfCthulhu #80sGames #90sGames #RetroGaming

Nightz12
5 months ago

Learning about writing my own programming language in C++.

All fun and easy till you want "good" error recovery while parsing 😩

#Coding #Parser #Programming

Vivienne Dunstan
5 months ago

Finished my last review for the 2023 #SpringThing #interactiveFiction competition. I played 20 of the 26 game entries, a wonderful mix of traditional #parser #textAdventure #games and choice based #Twine pieces and others. Thanks all! Here's my final sum up https://intfiction.org/t/viv-dunstan-s-2023-spring-thing-autumnal-jumble-impressions/61524/64 #GameDev #IndieGames #CreativeWriting

Gaël Reyrol
5 months ago

A code rewrite tool for structural search and replace that supports ~every language #ocaml #parser - https://github.com/comby-tools/comby #dailylinks

#json #dekudeals

I kinda wanna code a #parser for this ..
(also has #csv option...)

🤔

https://www.dekudeals.com/collection.json
Ed Howland
6 months ago

Sometime back, I was working on a presentation re: #BNF and had this epiphany.
The Epsilon transition was nothing to be spooked about.
For those who might not be aware: Epsilon is a terminal symbol that might
occur in a production. (A pox on my younger doofus self who thought it was the empty string).
It actually introduces nondeterminism into the #grammar rules.
The insight was that #EBNF with its regex-like symbols has no need of Epsilon.
#parser #compiler #syntax Read on ...

Hanon Ondricek
6 months ago

If you're a creator, player, archivist or fan of #interactivefiction we welcome you to join the discussion at intfiction.org!

https://intfiction.org

#if #textadventures #inform7 #twine #parser #CYOA #xyzzy #adventuregame #infocom #zork

lorddimwit
6 months ago

Going to the #parser store, anybody need anything?

A photo of a business’s sign saying “get cash ast and easy”
Vivienne Dunstan
6 months ago

Uploading the #Inform7 #SourceCode for my 2 released #parser #TextAdventure #InteractiveFiction #games prompted me to look at the code again. Still wowed by that language. I know some traditional programmers find it too strange. But as a former #C and declarative language #Prolog programmer I find it magical. And the Inform 7 #IDE is like playing an #adventure game as you write. Ridiculous fun. And now I want to write and release more Inform 7 games. https://ganelson.github.io/inform-website/ #GameDev #Programming

Vivienne Dunstan
6 months ago

The #Inform7 #sourceCode for my two released #parser #TextAdventure / #InteractiveFiction games (Border Reivers and Napier's Cache) is now in the #IFArchive, in the games/source/inform directory. This is part of a move this year to release more IF game source code publicly. http://www.ifarchive.org/indexes/if-archive/games/source/inform/ #ComputerGames #Adventure #HistoricalGames #ScottishGames #Code #GameDev

Vivienne Dunstan
7 months ago
Aral Balkan
7 months ago

Minimum¹ valid HTML file, did you say? Here you go:

<!doctype html><html lang><title>🤓</title>

(Tested with https://validator.w3.org/nu/#textarea)

I mean, needless to say, don’t do this. Just because you can doesn’t mean you should :)

¹ Minimum number of characters (not bytes) without errors or warnings, that is. If you don’t care about warnings, you can remove <html>. If you want the minimum number of bytes, replace the emoji with a dot or something ;)

#html #forgiving #parser #validation #web #dev

Jan :rust: :ferris:
7 months ago

@jsbarretto Wow, congrats, Joshua, this is absolutely mind-blowing! 🤯

On a related note:
When I first looked at your channel crate `flume`, I thought: "How are these perf numbers even possible!? And all in completely safe Rust!?"

And now `chumsky` as well! Amazing! :awesome:

It seems like everything you touch goes 🚀 😄

This is what the Rust community means by _technical excellence_!

Thank you for your hard work! ❤️

#RustLang #Rust #Performance #BlazinglyFast #Parser #TechnicalExcellence

Karsten Schmidt
7 months ago

Because #ThingUmbrella has such a wide scope, sometimes groundwork laid years ago can take me a while to properly benefit from, but oh boy, when it does, it does...!!! 😍

Case in point: This week I've finally been getting around to replacing the #Markdown-ish parser in https://thi.ng/hiccup-markdown. This #parser is NOT striving to be CommonMark compatible, but it does support a large MD subset and important additional features not available in standard Markdown, like custom content blocks and arbitrary metadata for _any_ block element (i.e. paragraphs, lists, codeblocks, tables, blockquotes...). These are all features I urgently need for my coursework, website generator & PKM tooling and which I'd been putting off for far too long...

Unlike the old handwritten parser, the new one is largely based on https://thi.ng/parse grammar definitions[1] and produces a nice abstract syntax tree of the Markdown document and makes transforming the raw nodes/elements a breeze. Additionally, using the totally under-appreciated https://thi.ng/defmulti polymorphic function setup adds sheer elegance to the new implementation.

The entire new setup also is super easy to extend & customize. By default all MD syntax elements are being transformed into https://thi.ng/hiccup format, i.e. the most basic general data exchange format based on S-expression-like nested vanilla JS arrays (also the defacto intermediary for dozens of other thi.ng/umbrella packages), see attached screenshot. Just this afternoon, it took me < 30 mins to update the grammar and implement nested inline styles (e.g. italic inside bold inside strikethrough inside a nested list element).

As you can maybe tell, I'm v.excited about this all (mainly because this was a big showstopper for other things). If there's interest I might even write a blog post about these techniques used.

To get back to the previously mentioned "pay off": This entire new parser was largely developed with the parser playground tool/editor I developed in summer 2020 in the almost 3 hours of my very first YouTube live stream[2].

The link to the playground (incl. the grammar & examples) is at the top of this source file:

[1] https://github.com/thi-ng/umbrella/blob/develop/packages/hiccup-markdown/src/parse.ts

[2] https://www.youtube.com/watch?v=mXp92s_VP40 (recommend playback @ 1.5x speed :)

Screenshot excerpt from the https://thi.ng/hiccup-markdown readme showing a super basic parse example and conversion from Markdown to JS/hiccup and then serialization to HTML
Ivan Enderlin 🦀
7 months ago

#weld

The `weld-parser` crate wasn't happy with its name. Now, we must refer to it as `weld-object`. This crate has ambitions for its life, like supporting Elf32, MachO, COFF, and more object formats. Look at this cheeky little thing!

https://github.com/Hywan/weld/commit/7556abeb5d80f015e92634d8b9a6c1494e815b9e

#RustLang #parser #object #elf64

Ivan Enderlin 🦀
7 months ago

#weld

Before leaving the elf64, I wanted to write some tests. Fortunately for me, parsers written with nom are really easy to test, they are just functions!

https://github.com/Hywan/weld/compare/13be3d2917f945cd347e9394c07853ac692f32de...9e6d55e2e137df3786b257228c7e35042520c0a7

When tests are easy to write, it's a pleasure to test everything.

With this test session, I've been able to fix one panic when reading a string in a data segment with an out-of-range offset.

#RustLang #test #elf64 #parser

A screenshot of one test.
Ivan Enderlin 🦀
8 months ago

The Elf64 parser now understands Symbol, https://github.com/Hywan/weld/commit/8d14d7574a2d41dc56dcfb047de9a0fc9f3572dd!
A gentle iterator is provided: we allocate only when necessary.

I missed a few symbol types and bindings, now in https://github.com/Hywan/weld/commit/d6b1d4de6349e1650158f78a04916696c76f7b9a and https://github.com/Hywan/weld/commit/735eb0bf25bc6df0ad976680ad887e6837b0cc15.

Now I see the same information as `objdump` shows me. Everything is well typed, still zero copy and few allocations.

In my test object file, I see zero relocations yet. Perfect. It’s time to link this simple object file for real 😳.

#elf #parser #RustLang

Ivan Enderlin 🦀
8 months ago

Debugging is important. I’ve added a std::fmt::Debug implementation for Elf64 header contents/data, https://github.com/Hywan/weld/commit/4ec23096b64775494226fea2e0e662bb24a410cb. It’s really helpful! It shows strings, and later the symbols…

#elf #parser #RustLang

Ivan Enderlin 🦀
8 months ago

Next stop: Defining a type for representing alignments. It must be a power of two, non zero unsigned integer. Easy. https://github.com/Hywan/weld/commit/f2e9f00dc65634b6b8145ad04c9221c6f478aaf3

With that, still zero copy but more and more semantics by leveraging the type system. Nice :-).

Note: Option<NonZeroU64> in Rust has the same size as u64. Cool! Zero cost abstraction once again.

#elf #parser #RustLang
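The "power of two, non-zero" rule is easy to show in isolation. The sketch below is a Python analogue written for illustration only; it is not weld's Rust type, and the `Alignment` class and its `align_up` helper are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Alignment:
    """An alignment value that is guaranteed to be a non-zero power of two."""
    value: int

    def __post_init__(self):
        # A power of two has exactly one bit set, so clearing the lowest
        # set bit with v & (v - 1) must leave zero.
        if self.value <= 0 or (self.value & (self.value - 1)) != 0:
            raise ValueError(f"{self.value} is not a non-zero power of two")

    def align_up(self, offset):
        """Round `offset` up to the next multiple of this alignment."""
        mask = self.value - 1
        return (offset + mask) & ~mask

eight = Alignment(8)
print(eight.align_up(13))  # 16
print(eight.align_up(16))  # 16
```

Validating once in the constructor means every later use can rely on the invariant, which is the same benefit the post describes getting from the type system.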

Ivan Enderlin 🦀
8 months ago

Each program header and section header in Elf has a content. It's represented by the new `Data` type in weld, https://github.com/Hywan/weld/commit/f541364004c99aa7d666687767cb4005da3c9263 and https://github.com/Hywan/weld/commit/4ec23096b64775494226fea2e0e662bb24a410cb.

I'm experimenting with this zero-copy API to request the content. Let's see where it goes. But it's very handy for debugging!

#elf #parser

Ivan Enderlin 🦀
8 months ago

I've added a new `SectionIndex` type, https://github.com/Hywan/weld/commit/9ecaef702532bc399c12229e58e688aa05cbf3a6.

It helps dealing with the semantics of section index in Elf64, and it also helps having a valid usize, which is helpful when used as a Vec index.

It's finally less error-prone: Using the Rust type system as much as possible.

#elf #parser

Vivienne Dunstan
8 months ago

A new set of #InteractiveFiction game #awards is taking votes right now on the best games of 2022, whether traditional #parser text #adventure #games, web #choice games or another form of interactive fiction. Awards decided by public vote. If you’ve enjoyed even one piece of interactive fiction released last year please add your votes. The awards are being run via the IFDB website and more details are in the IntFiction forum. https://intfiction.org/t/2022-ifdb-awards-are-now-open/60070/ #TextAdventure #ComputerGames #CreativeWriting

Ivan Enderlin 🦀
8 months ago

Yesterday, I also added support for big- and little-endian. All parser combinators can now handle endianness based on a generic type + trait, https://github.com/Hywan/weld/commit/5a1ff9f9643fe6b82e7b789e4c2cca7ee6615024.

It’s magic. Rust is cool.

#linker #parser #elf #RustLang
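As a loose illustration of parsers parameterised by endianness, here is a Python sketch using the standard `struct` module. It only mirrors the idea; weld does this in Rust with a generic type and a trait, and the `make_uint` factory below is a made-up name.

```python
import struct

LITTLE, BIG = "<", ">"  # struct's byte-order prefixes

def make_uint(endian, code):
    """Build a parser for one unsigned integer.

    `endian` is LITTLE or BIG; `code` is a struct format code such as
    "H" (16 bits), "I" (32 bits) or "Q" (64 bits).
    """
    fmt = struct.Struct(endian + code)
    def parse(data, offset=0):
        (value,) = fmt.unpack_from(data, offset)
        return value, offset + fmt.size
    return parse

raw = b"\x01\x00\x00\x00"
print(make_uint(LITTLE, "I")(raw))  # (1, 4)
print(make_uint(BIG, "I")(raw))     # (16777216, 4)
```

The byte order is chosen once when the parser is built, so the code that combines these parsers does not need to mention endianness at all.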

Ivan Enderlin 🦀
8 months ago

So far, I’m writing the Elf64 parser. The goal is to get zero copy, period.

Yesterday I added a section’s data and name, still with zero copy, https://github.com/Hywan/weld/blob/bfb9fd55c5b2f9114e8f8ab21c5f49d48f9c3b98/crates/parser/src/elf64/mod.rs#L720.

It relies heavily on Rust lifetimes, and bstr to get bytes-based string-ish. The parser is written with nom, and is manipulating bytes slices only.

bstr: https://blog.burntsushi.net/bstr/
nom: https://github.com/rust-bakery/nom

#linker #parser #elf #RustLang

heise online
8 months ago

Python library Bleach reaches version 6 and quietly says goodbye

The library for sanitizing HTML content updates to Python 3.11 and fixes problems in its handling of html5lib. However, Bleach is now considered deprecated.

https://www.heise.de/news/Python-Bibliothek-Bleach-erreicht-Version-6-und-sagt-leise-Good-Bye-7469422.html?wt_mc=sm.red.ho.mastodon.mastodon.md_beitraege.md_beitraege

#HTML #Mozilla #Parser #Python

Jan :rust: :ferris:
10 months ago

@rust_discussions "In the follow up discussion we can learn more about the idea for the new linter[...] by using abstract syntax tree (AST) to understand the code structure and organize it nicely."

How else would someone lint code if not with an #AST? What algorithm/data structure does #ESLint use currently?

#Compiler #Linter #Parser #AbstractSyntaxTree

Comfortably Numb
10 months ago

"twitter-archive-parser" script was updated and now is better than ever. If you want your downloaded Twitter archive still work when twitter collapses (plus tons of other enhancements) - parse your archive with https://github.com/timhutton/twitter-archive-parser

#python #script #twitter #archive #parser

Leandro
10 months ago

this actually started with @c_cube recommending to reuse the Yojson #json #parser, which is using a generated lexer, so it keeps its own sort of state and wouldn't be possible to reuse with an interface like this.

And then I just threaded a custom state throughout a Format Deserializer, which means anyone can pick whatever way of reading content they want.

Which yes means #yojson gets its own lexer_state type threaded.

Now if only Yojson gave me a clear peek/drop/next API, that'd be dope.

Ivan Enderlin 🦀
1 year ago

clap 4.0, a Rust CLI argument parser, https://epage.github.io/blog/2022/09/clap4/.

The author explains in detail what the 4.0 release provides: lots of code removed, faster to compile, faster runtime, more significant compilation features, simpler code…

Excellent work!

#cli #parser #rustlang

Ivan Enderlin 🦀
1 year ago

Lightning CSS, https://lightningcss.dev/.

An extremely fast and performant CSS parser, transformer, bundler and minifier.

Written in Rust, based on Servo's `cssparser` and `selectors` crates (used by Firefox). Available in NPM.

Save bandwidth easily :-).

#css #minifier #parser #rustlang

Parcel CSS, a new parser, compiler and minifier, competes with existing tools. It focuses on performance.
Web development: Rust-based tool Parcel CSS minifies faster than esbuild
2 years ago

#libpostal is a #parser for street addresses.

libpostal uses conditional random fields to parse natural language addresses and locations into a set of labeled parts, like house number and district. libpostal can use dictionaries to normalize parsed addresses, which determines the value of numbers, expands abbreviations, etc. libpostal supports many languages, including English, French, Russian, and Chinese.

Website 🔗️: https://github.com/openvenues/libpostal

#free #opensource #foss #fossmendations #OSM #ML

2 years ago

#PeppaPEG is a simple #PEG #parser for #C.

PeppaPEG is a small #library that loads parsing expression grammars (PEGs) and parses input strings. Grammars can use several decorators to change how text is parsed and put into the tree, like allowing spaces, squashing children into parent nodes, case insensitivity, etc. PeppaPEG includes several example grammars, including #JSON and Golang.

Website 🔗️: https://www.soasme.com/PeppaPEG/landing.html

#free #opensource #foss #fossmendations #programming

2 years ago

#RapidYAML is a fast #YAML #parser for #Cpp.

#ryml avoids duplicating data during parsing, instead finding the starts and ends of data in the source buffer. Once parsed, data can be indexed and converted from the string as needed. ryml's speed rivals some fast #JSON parsers when parsing YAML, and outperforms several purpose-built fast JSON parsers when parsing pure JSON. ryml can also construct YAML.

Website 🔗️: https://github.com/biojppm/rapidyaml

#free #opensource #foss #fossmendations #programming

2 years ago

#SQLGlot is a #Python #SQL #parser and transpiler.

SQLGlot is a parser as well as translator for various SQL dialects. SQLGlot offers multiple ways of parsing and dealing with SQL, including high level translation and low-level tokenizing and tree constructing interfaces. SQLGlot is the fastest pure-Python SQL parser, with performance comparable to Python bindings of sqlparser-rs for large queries.

Website 🔗️: https://github.com/tobymao/sqlglot

#free #opensource #foss #fossmendations #programming

2 years ago

#mjson is a small #JSON #parser and emitter.

mjson parses JSON using a state machine without recursion or dynamic memory allocations, making it use few resources. mjson uses paths for addressing elements, making it very straightforward and simple to use. mjson supports base64 encoded binary items, SAX parsing, and operation as JSON-RPC.

Website 🔗️: https://github.com/cesanta/mjson

#free #opensource #foss #fossmendations #programming #embedded