Just got my #IFComp2023 entry finalised, including a walkthrough and everything! Keep an eye out for when the games go live in a couple of days: if you're looking for something that's set in a colourful candyland but is still chock full of violence, sex and cannibalism, then boy do I have exactly that lined up for you! #InteractiveFiction #IFComp #parser
I found it! In the wild! Actual productive data! A #CSV-field with a line-break!
I thought it was the sort of thing that only exists in the specification, but no, there it is, I actually encountered one in real-world data!
And thus, I am proven correct once more: Don't use split, don't use regex, don't iterate over lines; use a proper CSV #parser!
And for those of you who are a bit confused right now: Yes, quoted CSV fields can totally contain newline characters.
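For anyone who wants to see the difference, here is a minimal sketch using Python's standard `csv` module. The sample data is made up for illustration; the point is only that splitting on line breaks cuts a quoted field apart, while a real CSV reader keeps it whole.

```python
import csv
import io

# Hypothetical sample data: the quoted field in the first record contains a line break.
raw = 'id,comment\r\n1,"first line\nsecond line"\r\n2,plain\r\n'

# Naive approach: splitting on line breaks cuts the quoted field in two.
naive_rows = [line.split(",") for line in raw.splitlines()]

# csv.reader tracks quoting state, so the embedded newline stays inside the field.
# newline="" keeps line endings untranslated, as the csv documentation recommends.
parsed_rows = list(csv.reader(io.StringIO(raw, newline="")))

print(len(naive_rows))   # header plus three fragments
print(len(parsed_rows))  # header plus two records
print(parsed_rows[1])    # the multi-line field survives as one value
```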
+ FuncApp ::= Single FuncAppAux | Single
+ FuncAppAux ::= FuncCallee FuncAppAux
+ FuncCallee ::= "(" ")" | "(" ARGS ")"
+ ARGS ::= SINGLE "," ARGS | SINGLE
A while back I made a post asking for help #playtesting my #IFComp entry - a #parser game made with #Inform. Quite a few people very kindly responded, but I'm ashamed to say I've since lost track of the post so I no longer know who they were.
Anyway. This is just to say if anybody wants to help with playtesting, please reply and let me know! The game is "Who Iced Mayor McFreeze?": essentially Who Framed Roger Rabbit meets Candy Crush. #gamedev #InteractiveFiction
#HowToThing #013 — Building a toy Lisp language and interpreter using the S-expression parser from https://thi.ng/sexpr and polymorphic multiple dispatch functions via https://thi.ng/defmulti. A small language like this can be useful for DSL purposes, user programming or for just learning about interpreters. The entire setup is highly customizable (incl. support for different kinds of S-expressions, see package readme).
Even this tiny example includes the following features: variadic math ops, ability to define new symbols/variables & functions, lexical scoping, numeric & string values...
Some example invocations are included at the end...
(Update: Minor code simplifications, updated images)
In hopes of making a better experience for itch.io players, I've tuned a release of Repeat the Ending for comfortable play/reading. I've also added instructions for "reading" RTE for those who would prefer not to experience it as a parser game. I hope this is helpful!
I wrote a #parser in #go this week for a custom file format that has a similar, yet distinct enough grammar from C-like languages that nothing worked out of the box. I was on a roll finding a library that gave me a customisable #lexer, then completely hit a wall on the parser. Turns out just writing it yourself in a bunch of functions over a number of files is way faster and more flexible than any library out there!
Eating lunch and I suddenly thought about a project I did about fifteen years ago.
It was a web UI toolkit, based on the idea of #declarative templates rerendered every time the model changed, and the model was a JSON doc that was synchronized between browser and server.
There was a #parser combinator library, expression language with a compiler and VM, and templating library.
The code is long gone but it looks like the paper is still available.
Examples of binary choices where both options seem equally good at first, but really aren't:
When writing parsers using #parser #combinators, consuming trailing whitespace is better than consuming leading whitespace. "Design patterns for parser combinators" https://dl.acm.org/doi/abs/10.1145/3471874.3472984
When writing #E2E #tests, clearing the database before each test is better than clearing it after. "Dangling state is your friend" https://docs.cypress.io/guides/references/best-practices#Dangling-state-is-your-friend
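To make the parser combinator point concrete, here is a minimal sketch of the trailing-whitespace convention. The tiny combinator set (`regex`, `lexeme`, `seq`, `parse_all`) is hypothetical and written just for this illustration; it is not taken from the linked paper or from any particular library. One commonly cited benefit is that every token parser starts on a non-whitespace character, so leading whitespace only needs to be skipped once, at the start of the input.

```python
import re
from typing import Callable, Optional, Tuple

# A parser takes (text, position) and returns (value, new_position), or None on failure.
Parser = Callable[[str, int], Optional[Tuple[object, int]]]

WHITESPACE = re.compile(r"\s*")

def regex(pattern: str) -> Parser:
    """Match a regular expression at the current position."""
    compiled = re.compile(pattern)
    def parse(text, pos):
        m = compiled.match(text, pos)
        return (m.group(), m.end()) if m else None
    return parse

def lexeme(p: Parser) -> Parser:
    """Run p, then consume any whitespace that follows the token."""
    def parse(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        value, end = result
        return value, WHITESPACE.match(text, end).end()
    return parse

def seq(*parsers: Parser) -> Parser:
    """Run parsers one after another and collect their values."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def parse_all(p: Parser, text: str):
    """Skip leading whitespace once, then require p to consume the whole input."""
    start = WHITESPACE.match(text).end()
    result = p(text, start)
    if result is None or result[1] != len(text):
        raise ValueError("parse error")
    return result[0]

number = lexeme(regex(r"\d+"))
plus = lexeme(regex(r"\+"))
addition = seq(number, plus, number)

print(parse_all(addition, "  1 +  2 \n"))
```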
winnow - Making parsing a breeze (by epage and contributors):
winnow is a parser combinator library written in Rust that started as a fork of #nom.
As good as they were, AMFV and Spellbreaker were Infocom's worst-selling games to date. Infocom's financial situation was dire, as it appeared that game sales were not enough to undo the financial damage that Cornerstone had done to the company. Such are the conditions of Ballyhoo's production, 1986's first Infocom release, a strange mix of brilliance and missteps. Let's dive in.
In particular, the idea is that you can export your entire environment to md-format files with a given depth. (For me, depth 3 did not export completely...)
I fixed the bug with profile links. There was also an annoying bug with a recursive function...
If you run your own instance, will there be no limits for the #API?
Parsing time stamps faster with SIMD instructions, https://lemire.me/blog/2023/07/01/parsing-time-stamps-faster-with-simd-instructions/.
A short article, almost a note, to explain how to parse time stamps with SIMD. 10x fewer instructions and 5.8x faster than strptime.
Pratt Parsing for Algebraic Expressions - Parsing algebraic expressions is always a pain. If you need to compute, say, 2+4*2... - https://hackaday.com/2023/07/03/pratt-parsing-for-algebraic-expressions/ #algebraiccomputation #softwaredevelopment #prattparser #parser
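As a rough illustration of the idea (not the code from the linked article), here is a compact Pratt-style evaluator. The binding power values and the function names `tokenize`, `parse_expression` and `evaluate` are assumptions made for this sketch. It handles non-negative integers, the four basic operators and parentheses, with no error reporting.

```python
import operator
import re

# Binding powers: a higher value binds tighter. The numbers are illustrative.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}

def tokenize(source):
    """Split the input into integer literals, operators and parentheses."""
    return re.findall(r"\d+|[-+*/()]", source)

def parse_expression(tokens, min_power=0):
    """Evaluate tokens (consumed from the front) using binding powers."""
    token = tokens.pop(0)
    if token == "(":
        left = parse_expression(tokens, 0)
        tokens.pop(0)  # discard the closing ")"
    else:
        left = int(token)
    # Keep absorbing operators that bind tighter than the caller's operator.
    while tokens and BINDING_POWER.get(tokens[0], 0) > min_power:
        op = tokens.pop(0)
        right = parse_expression(tokens, BINDING_POWER[op])
        left = APPLY[op](left, right)
    return left

def evaluate(source):
    return parse_expression(tokenize(source))

print(evaluate("2+4*2"))    # multiplication binds tighter than addition
print(evaluate("(2+4)*2"))  # parentheses reset the binding power
```

Passing the operator's own binding power into the recursive call, together with the strict `>` comparison, is what makes operators of equal precedence group to the left.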
Forest: Structural Code Editing with Multiple Cursors:
"In this work, we present Forest, a structural code editor that aims to bridge the gap between the interactiveness of code editors and the expressiveness of refactoring scripts."
Better than every #AI out there, if you ask me.
Also, #Forest is a very suitable name for this. 🌲
Taking a slight detour in my play through old #interactiveFiction games. Still have #KnightOrc lined up to play properly soon. But tonight it's time to go back to the 2001 #IFComp winning game All Roads. Set in magical #Venice, which I have to admit is a draw for me. I do remember enjoying this game a lot back in 2001. #parser #textAdventure #IndieGames
"Repeat the Ending" is the "critical edition" of a forgotten 1996 Inform 5 game about loss, mental illness, and the second law of thermodynamics. It won a "best in show" ribbon in the 2023 Spring Thing festival.
A post-festival release of the game, along with several digital feelies, is now available. Everything is linked at IFDB, which also has a "play online" feature.
Thanks for your interest!
Last night I got on a tear and wrote a complete #tokenizer for the Manatee programming language in C. I started at…9ish and finished at 1 in the morning
(It is, AFAIK, completely compliant except that I didn’t bother with Unicode. I suppose I could relatively easily augment it to use wchars…which aren’t *necessarily* Unicode but if we stick to standard C we gotta make sacrifices. __STDC_ISO_10646__ FTW I suppose.)
I suppose I should probably write a #parser over the weekend, time permitting
This is just perfect for what I need 🙌
I will open the #github repo when I feel more confident about making it public 😉
I have several #interactiveFiction #parser #textAdventure #games works in progress, that I'm in various stages of developing. Suddenly had great idea last night for one of them, reframing the opening and overall structure. Making frantic notes now before I totally forget it all. Typing into comments in my #Inform7 source code. Feeling invigorated. Hoping that my #neuro #illness allows me enough good time to complete this #game and the others satisfactorily. #IndieGames #GameDev #creativeWriting
Browsing the IFDB #InteractiveFiction database and just stumbled again across my own list of my top 10 personal favourite interactive fiction #games. The list is arranged in chronological order, from 1983 to 2017. Some predictable titles in there but also some less familiar ones. I include comments about each one. https://ifdb.org/viewlist?id=fplmp7feqc2spqwj #TextAdventure #Parser #CreativeWriting #IndieGames #GameDev #Top10 #Infocom #DouglasAdams #Tolkien #Fantasy #CallOfCthulhu #80sGames #90sGames #RetroGaming
Finished my last review for the 2023 #SpringThing #interactiveFiction competition. I played 20 of the 26 game entries, a wonderful mix of traditional #parser #textAdventure #games and choice based #Twine pieces and others. Thanks all! Here's my final sum up https://intfiction.org/t/viv-dunstan-s-2023-spring-thing-autumnal-jumble-impressions/61524/64 #GameDev #IndieGames #CreativeWriting
Sometime back, I was working on a presentation re: #BNF and had this epiphany.
The Epsilon transition was nothing to be spooked about.
For those who might not be aware: Epsilon is a terminal symbol that might
occur in a production. (A pox on my younger doofus self who thought it was the empty string).
It actually introduces nondeterminism into the #grammar rules.
The insight was that #EBNF with its regex-like symbols has no need of Epsilon.
#parser #compiler #syntax Read on ...
Uploading the #Inform7 #SourceCode for my 2 released #parser #TextAdventure #InteractiveFiction #games prompted me to look at the code again. Still wowed by that language. I know some traditional programmers find it too strange. But as a former #C and declarative language #Prolog programmer I find it magical. And the Inform 7 #IDE is like playing an #adventure game as you write. Ridiculous fun. And now I want to write and release more Inform 7 games. https://ganelson.github.io/inform-website/ #GameDev #Programming
The #Inform7 #sourceCode for my two released #parser #TextAdventure / #InteractiveFiction games (Border Reivers and Napier's Cache) is now in the #IFArchive, in the games/source/inform directory. This is part of a move this year to release more IF game source code publicly. http://www.ifarchive.org/indexes/if-archive/games/source/inform/ #ComputerGames #Adventure #HistoricalGames #ScottishGames #Code #GameDev
April 1 is IF #SourceCode Amnesty Day. I plan to join in on this, sharing my #Inform7 source code for my Scottish historical #parser #InteractiveFiction #games Border Reivers and Napier's Cache. http://blog.zarfhome.com/2023/03/april-1-is-if-source-code-amnesty-day.html #OpenSource #ScottishHistory #HistoricalGames #TextAdventures #GameDev #Inform #Coding #Programming #ComputerGames
Minimum¹ valid HTML file, did you say? Here you go:
<!doctype html><html lang><title>🤓</title>
(Tested with https://validator.w3.org/nu/#textarea)
I mean, needless to say, don’t do this. Just because you can doesn’t mean you should :)
¹ Minimum number of characters (not bytes) without errors or warnings, that is. If you don’t care about warnings, you can remove <html>. If you want the minimum number of bytes, replace the emoji with a dot or something ;)
@jsbarretto Wow, congrats, Joshua, this is absolutely mind-blowing! 🤯
On a related note:
When I first looked at your channel crate `flume`, I thought: "How are these perf numbers even possible!? And all in completely safe Rust!?"
And now `chumsky` as well! Amazing! :awesome:
It seems like everything you touch goes 🚀 😄
This is what the Rust community means by _technical excellence_!
Thank you for your hard work! ❤️
Because #ThingUmbrella has such a wide scope, sometimes groundwork laid years ago can take me a while to properly benefit from, but oh boy, when it does, it does...!!! 😍
Case in point: This week I've been finally getting around to replacing the #Markdown-ish parser in https://thi.ng/hiccup-markdown. This #parser is NOT striving to be CommonMark compatible, but it does support a large MD subset and important additional features not available in standard Markdown, like custom content blocks and arbitrary metadata for _any_ block element (i.e. paragraphs, lists, codeblocks, tables, blockquotes...) — these are all features I'm urgently needing for my coursework, website generator & PKM tooling and which I'd been putting off for far too long...
Unlike the old handwritten parser, the new one is largely based on https://thi.ng/parse grammar definitions and produces a nice abstract syntax tree of the Markdown document and makes transforming the raw nodes/elements a breeze. Additionally, using the totally under-appreciated https://thi.ng/defmulti polymorphic function setup adds sheer elegance to the new implementation.
The entire new setup also is super easy to extend & customize. By default all MD syntax elements are being transformed into https://thi.ng/hiccup format, i.e. the most basic general data exchange format based on S-expression-like nested vanilla JS arrays (also the defacto intermediary for dozens of other thi.ng/umbrella packages), see attached screenshot. Just this afternoon, it took me < 30 mins to update the grammar and implement nested inline styles (e.g. italic inside bold inside strikethrough inside a nested list element).
As you can maybe tell, I'm v.excited about this all (mainly because this was a big showstopper for other things). If there's interest I might even write a blog post about these techniques used.
To get back to the previously mentioned "pay off": This entire new parser was largely developed with the parser playground tool/editor I developed in summer 2020 in the almost 3 hours of my very first YouTube live stream.
The link to the playground (incl. the grammar & examples) is at the top of this source file:
 https://www.youtube.com/watch?v=mXp92s_VP40 (recommend playback @ 1.5x speed :)
The `weld-parser` crate wasn't happy with its name. Now, we must refer to it as `weld-object`. This crate has ambitions for its life, like supporting Elf32, MachO, COFF, and more object formats. Look at this cheeky little thing!
Before leaving the elf64, I wanted to write some tests. Fortunately for me, parsers written with nom are really easy to test, they are just functions!
When tests are easy to write, it's a pleasure to test everything.
With this test session, I've been able to fix one panic when reading a string in a data segment with an out-of-range offset.
The Elf64 parser now understands Symbol, https://github.com/Hywan/weld/commit/8d14d7574a2d41dc56dcfb047de9a0fc9f3572dd!
A gentle iterator is provided: we allocate only when necessary.
I missed a few symbol types and bindings, now in https://github.com/Hywan/weld/commit/d6b1d4de6349e1650158f78a04916696c76f7b9a and https://github.com/Hywan/weld/commit/735eb0bf25bc6df0ad976680ad887e6837b0cc15.
Now I see the same information that `objdump` shows me. Everything is well typed, still zero copy, with few allocations.
In my test object file, I see zero relocations yet. Perfect. It’s time to link this simple object file for real 😳.
Debugging is important. I’ve added a std::fmt::Debug implementation for Elf64 header contents/data, https://github.com/Hywan/weld/commit/4ec23096b64775494226fea2e0e662bb24a410cb. It’s really helpful! It shows strings, and later the symbols…
Next stop: Defining a type for representing alignments. It must be a power of two, non zero unsigned integer. Easy. https://github.com/Hywan/weld/commit/f2e9f00dc65634b6b8145ad04c9221c6f478aaf3
With that, still zero copy but more and more semantics by leveraging the type system. Nice :-).
Note: Option<NonZeroU64> in Rust has the same size as u64. Cool! Zero cost abstraction once again.
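The "power of two, non-zero" rule for alignments can be sketched in a few lines. This is a Python illustration, not weld's Rust type: the `Alignment` class and its `align_up` helper are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Alignment:
    """Hypothetical value type that only accepts non-zero powers of two."""
    value: int

    def __post_init__(self):
        # A power of two has exactly one bit set, so n & (n - 1) is zero.
        if self.value <= 0 or self.value & (self.value - 1) != 0:
            raise ValueError(f"{self.value} is not a non-zero power of two")

    def align_up(self, offset: int) -> int:
        """Round offset up to the next multiple of this alignment."""
        mask = self.value - 1
        return (offset + mask) & ~mask

eight = Alignment(8)
print(eight.align_up(13))  # rounds up to the next multiple of 8
print(eight.align_up(16))  # already aligned, so unchanged
```

Validating once in the constructor means every later use can rely on the invariant, which is the same benefit the type system gives in the Rust version.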
Each program header and section header in Elf has a content. It's represented by the new `Data` type in weld, https://github.com/Hywan/weld/commit/f541364004c99aa7d666687767cb4005da3c9263 and https://github.com/Hywan/weld/commit/4ec23096b64775494226fea2e0e662bb24a410cb.
I'm experimenting with this zero-copy API to request the content. Let's see where it goes. But it's very handy for debugging!
I've added a new `SectionIndex` type, https://github.com/Hywan/weld/commit/9ecaef702532bc399c12229e58e688aa05cbf3a6.
It helps dealing with the semantics of section index in Elf64, and it also helps having a valid usize, which is helpful when used as a Vec index.
It's finally less error-prone: Using the Rust type system as much as possible.
A new set of #InteractiveFiction game #awards is taking votes right now on the best games of 2022, whether traditional #parser text #adventure #games, web #choice games or another form of interactive fiction. Awards decided by public vote. If you’ve enjoyed even one piece of interactive fiction released last year please add your votes. The awards are being run via the IFDB website and more details are in the IntFiction forum. https://intfiction.org/t/2022-ifdb-awards-are-now-open/60070/ #TextAdventure #ComputerGames #CreativeWriting
Yesterday, I also added support for big- and little-endian. All parser combinators can now handle endianness based on a generic type + trait, https://github.com/Hywan/weld/commit/5a1ff9f9643fe6b82e7b789e4c2cca7ee6615024.
It’s magic. Rust is cool.
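For readers unfamiliar with why endianness matters when parsing, here is a small sketch in Python using the standard `struct` module. It is not how weld does it (weld uses a generic type plus a trait in Rust); the `read_u32` helper and the constants are hypothetical names for this example.

```python
import struct

# struct format prefixes: "<" means little-endian, ">" means big-endian.
LITTLE_ENDIAN = "<"
BIG_ENDIAN = ">"

def read_u32(data: bytes, offset: int, endian: str) -> int:
    """Read an unsigned 32-bit integer at offset with the given byte order."""
    return struct.unpack_from(endian + "I", data, offset)[0]

raw = bytes([0x01, 0x00, 0x00, 0x00])

# The same four bytes decode to different numbers depending on byte order.
print(read_u32(raw, 0, LITTLE_ENDIAN))
print(read_u32(raw, 0, BIG_ENDIAN))
```

Passing the byte order as a parameter means a single set of parsing functions can serve both kinds of files.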
So far, I’m writing the Elf64 parser. The goal is to get zero copy, period.
Yesterday I added each section’s data and name, still with zero copy, https://github.com/Hywan/weld/blob/bfb9fd55c5b2f9114e8f8ab21c5f49d48f9c3b98/crates/parser/src/elf64/mod.rs#L720.
It relies heavily on Rust lifetimes, and bstr to get bytes-based string-ish. The parser is written with nom, and is manipulating bytes slices only.
Python library Bleach reaches version 6 and quietly says goodbye
The library for sanitizing HTML content updates to Python 3.11 and fixes problems in its handling of html5lib. However, Bleach is now considered deprecated.
@rust_discussions "In the follow up discussion we can learn more about the idea for the new linter[...] by using abstract syntax tree (AST) to understand the code structure and organize it nicely."
The "twitter-archive-parser" script was updated and is now better than ever. If you want your downloaded Twitter archive to still work when Twitter collapses (plus tons of other enhancements), parse your archive with https://github.com/timhutton/twitter-archive-parser
this actually started with @c_cube recommending to reuse the Yojson #json #parser, which is using a generated lexer, so it keeps its own sort of state and wouldn't be possible to reuse with an interface like this.
And then I just threaded a custom state throughout a Format Deserializer, which means anyone can pick whatever way of reading content they want.
Which yes means #yojson gets its own lexer_state type threaded.
Now if only Yojson gave me a clear peek/drop/next API, that'd be dope.
Web development: Rust-based tool Parcel CSS minifies faster than esbuild
libpostal uses conditional random fields to parse natural language addresses and locations into a set of labeled parts, like house number and district. libpostal can use dictionaries to normalize parsed addresses, which determines the value of numbers, expands abbreviations, etc. libpostal supports many languages, including English, French, Russian, and Chinese.
Website 🔗️: https://github.com/openvenues/libpostal
PeppaPEG is a small #library that loads parsing expression grammars (PEGs) and parses input strings. Grammars can use several decorators to change how text is parsed and put into the tree, like allowing spaces, squashing children into parent nodes, case insensitivity, etc. PeppaPEG includes several example grammars, including #JSON and Golang.
Website 🔗️: https://www.soasme.com/PeppaPEG/landing.html
#ryml avoids duplicating data during parsing, instead finding the starts and ends of data in the source buffer. Once parsed, data can be indexed and converted from the string as needed. ryml's speed rivals some fast #JSON parsers when parsing YAML, and outperforms several purpose-built fast JSON parsers when parsing pure JSON. ryml can also construct YAML.
Website 🔗️: https://github.com/biojppm/rapidyaml
SQLGlot is a parser as well as translator for various SQL dialects. SQLGlot offers multiple ways of parsing and dealing with SQL, including high level translation and low-level tokenizing and tree constructing interfaces. SQLGlot is the fastest pure-Python SQL parser, with performance comparable to Python bindings of sqlparser-rs for large queries.
Website 🔗️: https://github.com/tobymao/sqlglot
mjson parses JSON using a state machine without recursion or dynamic memory allocations, making it use few resources. mjson uses paths for addressing elements, making it very straightforward and simple to use. mjson supports base64 encoded binary items, SAX parsing, and operation as JSON-RPC.
Website 🔗️: https://github.com/cesanta/mjson