#elasticsearch
DoorDash is hiring Senior Fullstack Engineer, Experimentation Platform
🔧 #java #kotlin #python #typescript #react #graphql #rest #aws #elasticsearch #postgresql #redis #seniorengineer
🌎 San Francisco, United States; Sunnyvale, CA
⏰ Full-time
🏢 DoorDash
Job details https://jobsfordevelopers.com/jobs/senior-fullstack-engineer-experimentation-platform-at-doordash-aug-17-2023-555244?utm_source=mastodon.world&ref=mastodon.world
#jobalert #jobsearch #hiring
Damn, #ElasticSearch is complicated! :flan_rage:
Which of the xpack.security.* settings can I actually use with a "basic" license?
Their example assumes the "trial" license (https://www.elastic.co/guide/en/elasticsearch/reference/7.17/configuring-tls-docker.html), which expires after 30 days.
I'm not even sure that what I'm trying to do is possible...
#mastoadmin
I have #elasticsearch running in docker with security enabled, which means only connections with https (not http) are allowed.
But mastodon seems to use only http, is that correct?
Do I have to de-secure elasticsearch?!
Yesterday at #TPDL2023 David Pride presented “CORE-GPT: Combining Open Access research and large language models for credible, trustworthy question answering”
Rather than #ZeroShot question/answering, Pride’s team combines the #CORE #OpenAccess dataset with #ElasticSearch to create #FewShot prompts that leverage the strength of combining #search results with the #LLM’s (#GPT) #summarization abilities to produce an answer to a user’s question including citations.

▪️ @cybernews research▪️ #DarkBeam left an #Elasticsearch and #Kibana interface unprotected, exposing records with user #emails and #passwords from previously reported and non-reported data breaches.
#dataleak #cybersecurity #datasecurity #infosec

Migrated from a comparatively weedy #DigitalOcean droplet to a beefier #Hetzner server, and it’s cheaper.
Took quite a bit of tinkering to get it working, particularly around nginx and certificates, but everything seems to have come across. Gone from 2GB RAM/50GB SSD to 8GB/80GB. Should have capacity for #ElasticSearch now, but that’s for another day.
Woo, turns out a version bump on the #elasticsearch chart caught a (fairly recent!) bugfix that solves the issue.
I'll be frank, I'm mostly working on #ElasticSearch at the moment.
Since it will run on a different server than Mastodon's, I have to study how remote, encrypted, authenticated connections work and run all of that in a Docker container. By default, it only accepts local connections, with neither authentication nor encryption.
Migrating #Elasticsearch for #MastodonSG, search function will be temporarily down.
From what I can tell, the issue with #elasticsearch is in the readiness (and potentially the liveness) probe. At the very least, its default timings are way too eager for my cluster, with each node failing before it can discover the other 7.
I slowed both probes way down, which is allowing the stack to see each other and start accepting commands from other pods. Unfortunately I think the probes are still failing.
I've noticed a lot of chatter about setting up Elasticsearch for Mastodon 4.2's new full text search over the last few days, including what hardware is required, how difficult is it, etc.
So I thought I’d write down my experience, including the hardware I'm running Elasticsearch on for my single user instance:
https://blog.thms.uk/2023/09/mastodon-elasticsearch?utm_source=mastodon
#mastoAdmin #singleUserInstance #FullTextSearch #Elasticsearch
I thought the server would feel it once I enabled searching public toots by their content through the new #ElasticSearch service. It doesn't: everything runs as smoothly and quickly as ever. 👏
And toot search is practically instantaneous.
Apt is complaining about the Elasticsearch repository.
I fetch the key again and add it.
Apt still maintains that the key is valid.
I check the Elasticsearch docs. No, it is the right key.
Things I'd really rather not spend time on...
#debian #elasticsearch #adminlife
https://www.elastic.co/guide/en/elasticsearch/reference/7.17/deb.html
@enusbaum I went to check if I needed to update #Elasticsearch on my #Mastodon server and then realized that I don't have Elasticsearch on my server because I have 4GB of RAM on my server and Elasticsearch would die on my server.
When we search toots by their content we have several search options, as shown in the image.
For example, say we want to find toots containing the word "Catalunya" published before October 1, 2017. In the search field we would enter:
Catalunya before:2017-10-01
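A minimal sketch of how a query like the one above could be assembled programmatically. The helper name `build_search_query` is hypothetical and not part of Mastodon; the only operator taken from the post is `before:`, and the `-term` exclusion syntax is an assumption about the Mastodon 4.2 search box that should be checked against the official documentation.

```python
from datetime import date
from typing import Iterable, Optional


def build_search_query(
    terms: Iterable[str],
    before: Optional[date] = None,
    exclude: Iterable[str] = (),
) -> str:
    """Compose a search-box string such as 'Catalunya before:2017-10-01'."""
    parts = list(terms)
    # Assumption: a leading '-' excludes a term from the results.
    parts += [f"-{term}" for term in exclude]
    if before is not None:
        # The post writes the date in ISO form (YYYY-MM-DD) after 'before:'.
        parts.append(f"before:{before.isoformat()}")
    return " ".join(parts)


print(build_search_query(["Catalunya"], before=date(2017, 10, 1)))
# prints: Catalunya before:2017-10-01
```

The function only builds the text; it still has to be typed or sent to a server that has full-text search enabled.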

It's finished! 👏
Public toot search is now 100% operational.
Remember that the content of our toots will only be found if we allow it in our profile: tab "Privacy and reach", option "Include public posts in search results".

Great! The developers of the #Ivory app for #iOS have released this latest version, which is compatible with the #ElasticSearch-backed search in #Mastodon v4.2.0 and lets you search public toots by their content. 👏

Indexing the public toots pushes the server's CPUs to 102% 😲
The 16 cores / 32 threads of the AMD Ryzen 9 5950X 16-Core Processor are working hard.

It still has a few hours to go before it finishes indexing them, but searching public toots by their content is already operational.

How to use Elasticsuite to boost the search results by a manually defined sorting relevance? This can be solved just by configuring Elasticsuite, no code adaptions are needed. I've documented the steps in my bitExpert blog post https://blog.bitexpert.de/blog/elasticsuite_sort_relevance_boost
I now have #ElasticSearch enabled and mastodont.cat configured to use it.
I've now told it to start indexing public toots. It will take more than 6 hours to finish because there are 74,364,024 toots (more than 74 million).
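As a rough back-of-the-envelope check, the two figures in the post imply an upper bound on indexing throughput. This sketch assumes exactly 6 hours; the post says "more than 6 hours", so the real rate would be lower. The function name is illustrative only.

```python
TOOTS = 74_364_024  # toot count quoted in the post
HOURS = 6           # assumption: the post only says "more than 6 hours"


def docs_per_second(total_docs: int, hours: float) -> float:
    """Average number of documents indexed per second over the whole run."""
    return total_docs / (hours * 3600)


rate = docs_per_second(TOOTS, HOURS)
print(f"at most about {rate:,.0f} toots indexed per second")
# prints: at most about 3,443 toots indexed per second
```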


@spla I was reading a thread where our colleague @chris was getting advice on this. Take a look at this article, it gives some tips for keeping #ElasticSearch's memory under control:
The poll has ended with a clear 82% in favor (https://mastodont.cat/@spla/111108053681316138).
Today, when I have a moment, I'll enable #ElasticSearch so that public toots can be searched by their content.
Remember that by default the content of our public toots cannot be found unless we give Elasticsearch permission in our profile:
https://mastodont.cat/settings/profile
Tab "Privacy and reach", section "Search", option "Include public posts in search results". You need to enable it and save the changes.

I'm going to have to work with #elasticsearch this week 😬 send me your thoughts and prayers
If you are lowering your -Xms / -Xmx memory for #ElasticSearch, you'd better check on it once in a while because, at some point, it will get a kill signal if it is set too low. #MastoAdmin
Quickest way for myself, without logging into ssh, is to go to Mastodon Admin dashboard and it will tell you at the top if there is an issue with ElasticSearch.

@hirad I had but while checking I noticed I was bit by a #mastodon #issue solved using https://github.com/mastodon/mastodon/issues/17145#issuecomment-998190956
My server is currently #rebuilding my #ElasticSearch indexes.
I tried to enable #Elasticsearch in my #Mastodon solo instance, and it triggered a huge workload to index around 17 million documents.
This process would have taken several days to complete with the current resources allocated to the instance.
I opted for terminating the process and disabling Elasticsearch, but I'd like to hear other experiences:
- Is this only a temporary behavior? (i.e. after the initial indexing it becomes easier on the server).
- Is there any periodic 'cleanup' process? (i.e. my server ingests many GB of toots every day, will I end up with a huge Elasticsearch database?).
- Am I missing something valuable by not implementing Elasticsearch?
Looking forward to hearing your thoughts.
Booyah!!!
Got Searching working for #MastodonSG on v4.2!
Took the chance to upgrade the RAM and #Elasticsearch version as well!
It’s gonna take a while to re-index.
#MastoAdmin
Older posts are starting to get interactions and traction.
I presume this is because of the new Mastodon search feature and people finding the posts.
So are there search parameters that can be used on #elasticsearch?
Let’s say: 1️⃣ I want to see all posts but not replies from “@ someone @ some server”
2️⃣Or I want to find all posts with “Mario” but not “Nintendo” (or AND “Nintendo”)
3️⃣Or all posts with “cat” before September 20, 2023.
I don’t see any documentation on whether or not this can be done. 😕
#mastodonhelp #search
Lots of pain caused by #ElasticSearch in the update.
We're still indexing.... some 24 hours later. no end in sight.
#Mastodon v4.2.0 is installed, search indexing is enabled, and while I was at it I gave a few other corners of the server a good wipe-down.
Let's see whether everything runs, and then next week perhaps install #ElasticSearch or #OpenSearch.
Until now I didn't really see the point of it, but with the new changes it might actually become useful. Though I fear that in the end hardly anyone will opt their own profile into search.
Covered in grime, I return from the engine room after many hours and report: we at #WueSocial are now travelling on through the #Fediverse at warp 4.2. Phew! 😅
I also treated us to full-text search. #Elasticsearch
That really was a wild ride; I was up to my elbows in the Postgres database, further than I would have liked. But it seems to work. 🤞
https://blog.joinmastodon.org/2023/09/mastodon-4.2/
I haven't set up #Elasticsearch, so the full-text search feature isn't available. Too much hassle. Posts can still be found by hashtag, so thank you, everyone. #fedibird
@chris @anderspuck I'm happy with Elasticsearch performance after reducing the heap to 2GB:
heapCurrent heapPercent heapMax
951.9mb 46 2gb
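The three columns resemble the output of Elasticsearch's `_cat/nodes` API, though that is an assumption: the post does not say which command produced them. A small sketch of turning such text into a dictionary and cross-checking the percentage; the function names are hypothetical and only the single-row, whitespace-separated layout shown above is handled.

```python
from typing import Dict

# Binary multipliers; Elasticsearch reports byte sizes in powers of 1024.
UNITS_IN_MB = {"kb": 1 / 1024, "mb": 1.0, "gb": 1024.0}


def parse_cat_row(text: str) -> Dict[str, str]:
    """Map each header name to the value in the first data row."""
    header, row = (line.split() for line in text.strip().splitlines()[:2])
    return dict(zip(header, row))


def to_mb(size: str) -> float:
    """Convert a size such as '951.9mb' or '2gb' to megabytes."""
    return float(size[:-2]) * UNITS_IN_MB[size[-2:].lower()]


SAMPLE = "heapCurrent heapPercent heapMax\n951.9mb 46 2gb"
stats = parse_cat_row(SAMPLE)
used = to_mb(stats["heapCurrent"]) / to_mb(stats["heapMax"])
print(f"{used:.1%} of the heap in use")
```

The computed ratio truncates to 46, which is consistent with the `heapPercent` value reported in the post.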
What should I install first on the #FédiQuébec #Mastodon instance: #ElasticSearch / #OpenSearch, to enable extended searches, or #LibreTranslate, to translate toots directly in Mastodon's web interface?
Would any #mastoadmins be able to point me to a good document on enabling #elasticsearch? Would the official documents be okay for an instance already in existence?
📢 Calling all #opensource #selfhosting #greenhosting #hosting boffins on the #fediverse!
I'm running various instances of #mastodon 4.2 (all using @cloudron which makes doing such things easy)
I'd like to connect at least one of them (the biggest most public one) to an instance of #ElasticSearch to take advantage of the new #fulltextsearch feature featured in #masto 4.2.
Where is the best place to get a reliable affordable and #renewable energy powered one-click install of Elastic Search? 🙏
Can confirm. The new full-text search works great, and is awesome! I just searched for "footiMac", which I know only I use, and it returned results very, very quickly. I can go way back, including to its very first mention back in January!
This ability is so very important for the usability and attractiveness of Mastodon!
#footiMac #Mastodon #Mastodon4dot2 #search #ElasticSearch #mastoAdmin #selfhost #selfhosted
@umiamz I just reenabled ElasticSearch and made that modification to the jvm options and instead of 13.4GB/15.6GB I'm at 7.3GB of 15.6GB. Excellent. Thank you for the tip. I will tweak as required. ElasticSearch is definitely taking some CPU time as well but this could be just from being new, so I'll let it run for a few days and see what the experience is like.
#MastoAdmin #ElasticSearch #footiMac #selfhost #selfhosted #Debian
@anderspuck

@chris Yes - create a file in
/etc/elasticsearch/jvm.options.d/
e.g. memory.options, containing
-Xms2048M
-Xmx2048M
then restart Elasticsearch. That will reduce it to around 2GB. By default it takes half the RAM, hence your 7-8GB usage :)
You can always raise it if it's too little for it to work properly.
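A minimal sketch of generating the file described above. The function names are hypothetical; the directory `/etc/elasticsearch/jvm.options.d/`, the file name `memory.options` and the 2048M value come from the post. Writing into `/etc` normally needs root, so the writer takes the target directory as a parameter instead of hard-coding it.

```python
from pathlib import Path


def render_heap_options(heap_mb: int) -> str:
    """Return jvm.options.d content that pins min and max heap to the same size."""
    if heap_mb <= 0:
        raise ValueError("heap size must be positive")
    return f"-Xms{heap_mb}M\n-Xmx{heap_mb}M\n"


def write_heap_options(directory: Path, heap_mb: int, name: str = "memory.options") -> Path:
    """Write the options file into `directory` and return its path."""
    target = directory / name
    target.write_text(render_heap_options(heap_mb))
    return target


print(render_heap_options(2048), end="")
```

For example, `write_heap_options(Path("/etc/elasticsearch/jvm.options.d"), 2048)` run with sufficient privileges would produce the file from the post; Elasticsearch still has to be restarted afterwards, as the post says.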
#Mastodon 4.2 released: public posts by other users can now be searched in a privacy-conscious way | gihyo.jp
https://gihyo.jp/article/2023/09/mastodon-4.2
"Getting post search to work requires Elasticsearch to be set up on the server side. Many servers had held off on Elasticsearch because of its cost, but more of them are expected to adopt it on this occasion."
So search really does seem to be the main thing this time. Has the lists matter been put off?
By the way, #Elasticsearch is apparently "a full-text search engine designed to handle large volumes of data" → https://www.designet.co.jp/faq/term/?id=RWxhc3RpY3NlYXJjaA#:~:text=Elasticsearch%E3%81%A8%E3%81%AF%E3%80%81%E5%A4%A7%E5%AE%B9%E9%87%8F,%E3%82%92%E6%8B%85%E5%BD%93%E3%81%97%E3%81%A6%E3%81%84%E3%82%8B%E3%80%82
A reminder to developers: YOU DO NOT CONFIGURE A PROGRAM VIA AN API BUT VIA A CONFIGURATION FILE.
Bunch of code-churning assholes.
Oh right, I haven't deployed #ElasticSearch yet. I'll do that another time, maybe after I ran all the cleanups recommended by @ricard in https://ricard.dev/improving-mastodons-disk-usage/
#Mastodon #MastoAdmin
Our server bildung.social is now running version 4.2.0 with a significantly improved search function. For your own posts to be found, go to Preferences - Public profile - "Privacy and reach" tab at the top and enable: Include profile and posts in search algorithms.
#suche #mastodon #elasticsearch
@fries @favstarmafia

Can users turn on permission for search in #Mastodon even if the instance they are on has not enabled it, e.g. because of the resource requirements of #Elasticsearch?
querqy + #elasticsearch for (de)compounding, which german and some other languages love 😅
#HaystackConf

"Reciprocal Rank Fusion (RRF)" or how to stop worrying about boosting — slides from #HaystackConf: https://xeraa.net/talks/reciprocal-rank-fusion/
First half is on the algorithm in general, second on how to use it in #elasticsearch
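For readers who have not seen the algorithm, here is a minimal sketch of reciprocal rank fusion as it is commonly described: each document scores the sum of 1 / (k + rank) over every ranked list it appears in. This is not code from the slides or from Elasticsearch; the function name, the sample document IDs and the constant k = 60 (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one list, best first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit of each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the result is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["a", "b", "c"]  # e.g. from a keyword query
vector_hits = ["c", "a", "d"]   # e.g. from a vector query
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
# prints: ['a', 'c', 'b', 'd']
```

Only ranks are used, never the raw scores, which is why no per-query boosting or score normalization is needed.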

Running tests with #ElasticSearch, searching toots by their content.
I'm looking at the server resources it consumes and assessing whether it can be enabled here, on mastodont.cat.
Remember that the next version the #Mastodon project will release tomorrow, v4.2.0, allows searching toots by their content (each user's individual permission is required).

multilingual support (and its problems 😅) in #lucene / #elasticsearch by @lucianprecup at #haystackconf

Very much interested to deploy Mastodon 4.2 full text search functionality on my small instance.
Anyone experienced how much additional resources #Elasticsearch takes? Is it better to dedicate separate VM for it?
Will the #Mastodon 4.2 update now require #ElasticSearch or will it still be an optional feature?
#Mastodon uses #Elasticsearch and hence as a #mastoadmin I have to deal with scaling the latter, which raises questions.
ES isn't as elastic as the name suggests, right? Or is it possible to quickly add and remove cluster nodes to adapt to the query load? With load changing predictably throughout the day (most users in one time zone), it would be a shame (economically and ecologically) to have a cluster running that's sized for peak loads, idling most of the time.
#elasticsearch and the full #elastic stack 8.10.1 is out. #kibana was leaking sensitive information in its logs (see https://discuss.elastic.co/t/kibana-8-10-1-security-update/343287), so upgrade away
elastic cloud should be patched up by us and it was the first time that we had to pull a release — fun weekend operation 🙈
Terrified of Terraform relicensing?
Lessons learned from relicensing of Terraform, Elasticsearch, Redis and more, and practical guidance for vetting and using open source wisely
https://horovits.medium.com/331d83f182c
#terraform #relicensing #opentf #opensource #oss #elasticsearch #redis
#IF (and #IT's a #BIG #IF) you want #AllThoseConditions met; you could use #ElasticSearch; #IT's an #OptionalExtra... And, #IT's a #WorkInProgress, at #TheSameTime... | #Concurrency
🧙⚔️ 🤖🐺🤖⚔️🧙 | ☕🎠🦹🦄🦹🎠☕
https://syzito.xyz/@mastodonmigration@mastodon.online/110957761627347943
Trying to figure out how to configure (https://docs.joinmastodon.org/admin/optional/elasticsearch) and enable #ElasticSearch (https://github.com/elastic/elasticsearch) on our instance to be able to try the new beta masto release's #FullTextSearch feature https://github.com/mastodon/mastodon/pull/26344
#MastoAdmin any advice/recommendations for minimum CPU/storage/RAM etc requirements? I know it will be resource heavy but want to get a sense of how much would we need to beef up our server...
The new #Mastodon full text search feature coming in v4.2 uses #ElasticSearch. I wonder what they know, that others don't? The articles I have read on ES suggest it is quite slow and resource intense.
I'll continue my quest for enlightenment.
On a side note, I have to say that I find #Meilisearch can really punch our Firefish server sometimes too
The next version of #Mastodon, v4.2.0, adds the ability to search text and thus find toots containing the words we are looking for.
It requires #ElasticSearch to be enabled, which uses a fair amount of machine resources (I have never enabled it on mastodont.cat), but we also need to configure our profile to give permission for the text of our toots to be searched.
@null So I just sat down to try this and it didn't seem to work: my RAM usage after restart immediately goes to 70% no matter the file options. Turns out that the jvm.options file says:
## WARNING: DO NOT EDIT THIS FILE. If you want to override the
## JVM options in this file, or set any additional options, you
## should create one or more files in the jvm.options.d
## directory containing your adjustments.
I wouldn't even know how to format a new file...😬 #Elasticsearch #linux #mastoadmin #mastodon #fediverse
"our team decided to go with only scripting instead of query"
when someone tries to reimplement #elasticsearch's match query on keyword fields with painless 🫣
Various Mastodon services show a LOT of warnings about ElasticSearch security in the logs. The default config and docs provided by Mastodon leave it unsecured, but it's really not hard to enable.
I finally got around to doing it today, and it was painless.
I'll list the basic steps in the next toot. 🧵
It feels laughably complex for what I'm working on, but I'm really liking this #logging stack of #OpenTelemetry, #Jaeger, #ElasticSearch, and #Kibana. OpenTelemetry's spans are really cool, and you get a lot out of the box with the auto-instrumentation (automatic spans for supported libraries like Postgresql and Nextjs). Attaching properties to spans is the cherry on top.
I'm not super happy with how the logs show up in Kibana; you end up with a weird nested query syntax. But it works!
DoorDash is hiring Software Engineer - Data Insights and Metrics Platform
🔧 #aws #elasticsearch
🌎 Remote; United States
⏰ Full-time
💰 $179k - $243k
🏢 DoorDash
Job details https://jobsfordevelopers.com/jobs/software-engineer-data-insights-and-metrics-platform-at-doordash-nov-22-2022-1f1bd4?utm_source=mastodon.world&ref=mastodon.world
#jobalert #jobsearch #hiring
ES|QL: a new piped Query Language for Elasticsearch. why?
1. already many options but none that work well across #elasticsearch and #kibana; plus the challenge of writing JSON queries for humans
2. a query engine. with dramatic performance improvements 🚀
https://www.elastic.co/blog/elasticsearch-query-language-esql
"Accelerating vector search with SIMD instructions" on the #elastic blog: https://www.elastic.co/blog/accelerating-vector-search-simd-instructions
going native with java's project panama in #elasticsearch makes a difference for vector search. and great to see the validation from users: https://github.com/elastic/elasticsearch/issues/92260#issuecomment-1664725129
Yeahhhh I don’t think this is gonna save me any money. Mastodon seems to connect to Elasticsearch pretty regularly, so the VM just keeps waking up a few minutes after it goes to sleep.
load testing with #K6, storing results in #elasticsearch, and visualizing it in #kibana: https://medium.com/@gabriel.nascimento2048/get-better-performance-insights-by-using-elasticsearch-and-k6-through-docker-for-load-testing-ff5b0324433a
great blog to get you started and bonus points for the cute logo
I’m experimenting with having Elasticsearch shut down when it’s not being used and start on demand when it’s needed using Fly.io’s autostart feature, to make running it as cheap as possible.
So far it seems to be working, at the expense of a ~30 second wait if you search for something when Elasticsearch isn’t running.
https://github.com/rgrove/pie.gd/commit/048a76c345c70ccc133e3f942a886d9bbad21950
I added Elasticsearch to my tiny Mastodon instance hosted on Fly.io and you can too! Was hoping to give it a small VM to keep the cost down, but ES won't even deign to respond to an HTTP request without at least 2GB of RAM at its disposal, so I guess we'll see how this goes.
https://github.com/rgrove/pie.gd/commit/2438e98be6aa5992e644f0c15bdb395ff1853f73
It's #Elasticsearch. Eats up too much memory. I have now allocated less for it. #MastoAdmin
Does anyone have Elasticsearch enabled on their single-user instance? What specs does this require?
great answer on #elasticsearch rollups vs downsampling and why the first is being replaced by the second (from https://github.com/elastic/elasticsearch/issues/72545#issuecomment-1604764622)
PS: I'm never quite sure where is up and down in this context 😅
the slow boiling 🐸 (the good kind): continuous improvements on #elasticsearch data ingestion over the last few minor versions — from simple keywords to heavy kNN vectors, as well as ingest-pipeline-heavy ingestion workloads
https://www.elastic.co/blog/data-ingestion-elasticsearch
Have you done Elasticsearch administration at scale (~ PB+), live in the US, and are looking for a job?
If so please let me know. I don’t have a job rec to point to yet, but an adjacent team is starting to hire.
#hiring #elasticsearch #techjobs
(Please boost)
Question to the #mastodon #selfhosting crowd: is it worth the resource and cost to run #elasticsearch for advanced search and/or libretranslate for translations on your server? What is the RAM impact you’re seeing on a small 1-10 users instance? Any feedback greatly appreciated.
@patrick_h_lauke @philsherry @Mastodon You can do it, and @mastohost allows you to buy #ElasticSearch as an add-on, which does precisely what you are asking for:
https://masto.host/about-full-text-search-elasticsearch/
"When enabled, users on a Mastodon server can search posts related to themselves. Examples: Posts they published, Posts they favourited, Posts where they were mentioned, Their direct messages"
But since you're on mastodon.social I'm not sure that will get enabled. It's expensive.
#elasticsearch Relevance Engine (ESRE, spoken ez-ray) — combining all recent work on search from #elastic: traditional + vector + hybrid search, custom and third-party transformer models, normalization of scores,...
all on the _search endpoint you know :)
https://www.elastic.co/blog/may-2023-launch-announcement
we're having another #elastic meetup in vienna tonight: https://www.meetup.com/elastic-usergroup-vienna/events/292714224/
1. configuring your elasticsearch cluster on kubernetes with a custom resource (the git / kubernetes way)
2. #elasticsearch vector and hybrid search
entertainment for a bad weather day ;)
@atomicpoet Huh‽ #Misskey & #Mastodon have search.
Misskey turns theirs off by default (you can easily turn it on within settings) & Mastodon requires you to set up #ElasticSearch on your instance. I use search on both of my solo instances.
Misskey does not have an official app, but #MissRirica replicates most (if not all) of the options on the web interface (something most official apps on other platforms can not do yet).