Juan-Carlos Gandhi ([personal profile] juan_gandhi) wrote 2015-10-01 02:02 pm

long id

I think I get where all this bs about passing around numerical ids of entities instead of entity references (maybe lazy) comes from. It's like an 'error code'. It comes from ancient C programming, where we just could not allocate a string for a readable piece of text, or for data that may need some effort to instantiate or allocate.

In short. It's stupid to pass around "ids" in a program.

[identity profile] rssh.livejournal.com 2015-10-01 09:10 pm (UTC)(link)
Hmm, isn't 'persistent reference <=> id'?

[identity profile] juan-gandhi.livejournal.com 2015-10-01 09:44 pm (UTC)(link)
It is, but why pass it around if we can have a lazy data provider in memory and point to that, with the persistent reference hiding inside?
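
A minimal sketch of what such a lazy reference might look like, in Python for brevity. The names `LazyRef`, `load_user` and the in-memory `USERS` table are made up for illustration; a real data provider would hit a database or a service.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyRef(Generic[T]):
    """A reference that keeps the persistent id private and loads on demand."""

    def __init__(self, entity_id: int, loader: Callable[[int], T]) -> None:
        self._id = entity_id          # the persistent reference, hidden inside
        self._loader = loader         # hypothetical data provider
        self._value: Optional[T] = None
        self._loaded = False

    def get(self) -> T:
        # Load at most once; later calls reuse the cached value.
        if not self._loaded:
            self._value = self._loader(self._id)
            self._loaded = True
        return self._value


# Stand-in for a real store (assumption); load_calls records every real lookup.
USERS = {10567: "Alice"}
load_calls = []


def load_user(user_id: int) -> str:
    load_calls.append(user_id)
    return USERS[user_id]


account = LazyRef(10567, load_user)   # nothing has been loaded at this point
```

Callers hold `account` and call `account.get()` when they need the data; the number 10567 never shows up in their code.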

[personal profile] dennisgorelik 2015-10-01 09:25 pm (UTC)(link)
> It's stupid to pass around "ids" in a program

What if you are passing "ids" to a database (to retrieve full records)?
Is it still stupid?

[identity profile] juan-gandhi.livejournal.com 2015-10-01 09:45 pm (UTC)(link)
That's technical. ids are like addresses. You don't have to see them.

[identity profile] vit-r.livejournal.com 2015-10-01 09:26 pm (UTC)(link)
When quantity turns into quality and the table no longer fits in the server's memory, interesting things begin.

And then the ways of computing uniqueness will be magnificent too.

[identity profile] rezkiy.livejournal.com 2015-10-01 09:43 pm (UTC)(link)
<rant>

So what do you do if you actually DO fail to allocate? You pass an exception through by whatever means. Now let's talk about how your cyclomatic complexity affects your ability to reason about the correctness of your program.

If you never learnt to code in imperative programming languages (such as C/C++), it doesn't exactly mean that those who did learn are stupid or something.

</rant>

[identity profile] juan-gandhi.livejournal.com 2015-10-01 09:46 pm (UTC)(link)
You don't have to allocate everything for everything. Conceptually it's here; actually it's not, and all we have, encapsulated, is that id. But we don't want to see it.

They are not stupid. They are just from an older age, the mid-20th century.
Edited 2015-10-01 21:46 (UTC)

[identity profile] no more turtles (from livejournal.com) 2015-10-01 09:47 pm (UTC)(link)
Does this wisdom have anything to do with the fact that a Unix server under full load with uptimes of more than a year is par for the course, while a Java server with no particular load has me running over to reboot it every week, because of an OOM exception and the whole machine hanging like a zombie (yes, the whole machine, Java is unkillable!), even though it still responds to ping?

[identity profile] juan-gandhi.livejournal.com 2015-10-02 12:33 am (UTC)(link)
Of course it does. We open a shell on the phone and telnet to the server to read LiveJournal in ASCII. Preferably in the form of joke numbers.

[identity profile] cema.livejournal.com 2015-10-01 11:36 pm (UTC)(link)
Yes, it's some more C legacy. But we do need a handle, just a better abstraction than a number.

[identity profile] juan-gandhi.livejournal.com 2015-10-02 12:33 am (UTC)(link)
Sure. It does not matter how we retrieve values via pointers.

[identity profile] soonts.livejournal.com 2015-10-02 12:07 am (UTC)(link)
>passing around numerical ids of entities instead of entity references
Numerical ids are much better.

>It's like 'error code'.
No it’s not.

You can store them in a CPU register.
You can save them using only 8 bytes of storage.
You can compare them with a single cmp CPU instruction.
You can design your system so the IDs are unique even if your system will scale out to thousands of servers.
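
One way the last point is often done, sketched in Python: pack a timestamp, a server id and a per-millisecond sequence number into a single integer. The 41/10/12 bit split is an assumption borrowed from Snowflake-style generators, not something stated in this thread, and `make_id` / `split_id` are illustrative names.

```python
# Assumed layout: 41 bits of milliseconds | 10 bits of server id | 12 bits of sequence.
TIME_BITS = 41
SERVER_BITS = 10
SEQ_BITS = 12


def make_id(millis: int, server: int, seq: int) -> int:
    """Pack the three fields into one integer of at most 63 bits."""
    fields = (("millis", millis, TIME_BITS),
              ("server", server, SERVER_BITS),
              ("seq", seq, SEQ_BITS))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    return (millis << (SERVER_BITS + SEQ_BITS)) | (server << SEQ_BITS) | seq


def split_id(value: int):
    """Recover (millis, server, seq) from a packed id."""
    seq = value & ((1 << SEQ_BITS) - 1)
    server = (value >> SEQ_BITS) & ((1 << SERVER_BITS) - 1)
    millis = value >> (SERVER_BITS + SEQ_BITS)
    return millis, server, seq


# Two servers minting an id in the same millisecond still get different values.
id_a = make_id(1_000, server=1, seq=0)
id_b = make_id(1_000, server=2, seq=0)
```

Uniqueness here rests on every server having its own server id and never reusing a sequence number within a millisecond; the sketch does not enforce either.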

[identity profile] juan-gandhi.livejournal.com 2015-10-02 12:36 am (UTC)(link)
Oh shit, never thought about CPU registers somewhere inside my phone.

And I never suggested passing internal references around outside. This may be an interesting problem, passing around internal references; even a numerical id is an internal reference if we look from outside. I routinely talk to our analysts like this: "did it work now for account number 10567?" Now imagine... well, we all know how it is. Phone number (why the fuck is it still a number?), SSN, card number (why a number?), etc.

[identity profile] yatur.livejournal.com 2015-10-02 04:32 am (UTC)(link)
> You can store them in a CPU register.
> You can save them using only 8 bytes of storage.
> You can compare them with a single cmp CPU instruction.

These are all "C" concerns. The difference is that for Vlad the "C" language (and a CPU register) sounds like something from the age of the pyramids, while for embedded developers this is the painful present.

> You can design your system so the IDs are unique

This does not really say anything for or against passing "naked" IDs vs lazy entities. You can pass around lazy entities with unique IDs.

Edited 2015-10-02 04:33 (UTC)

[identity profile] yatur.livejournal.com 2015-10-02 03:59 am (UTC)(link)
I think C programming has nothing (or little) to do with it. Access to (relational) databases via lazy object references is very harmful to performance.

Databases are designed to get mass quantities of records in one shot. E.g. retrieving 10,000 records via one simple query (SELECT * FROM Users WHERE OrganizationId=125) is virtually instantaneous. Lazy entity references are designed to get one object at a time. Retrieving 10,000 records via 10,000 queries (SELECT * FROM Users WHERE UserId=n, repeated 10,000 times) will take forever.

ORMs like Hibernate do provide lazy references, and make things look easy on toy databases. However, this ease is deceptive. Later on you find yourself rewriting half of the program to get rid of the lazy references, because it just does not work fast enough in production.
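
The two access patterns can be put side by side with Python's built-in `sqlite3` and an in-memory table. This is only a sketch: the `Users` columns beyond the two ids, the `run` helper and the 1,000-row size are assumptions, and an in-memory SQLite database has no network round trip, so it shows the query count rather than production timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Users (UserId INTEGER PRIMARY KEY, OrganizationId INTEGER, Name TEXT)"
)
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?)",
    [(n, 125, f"user{n}") for n in range(1, 1001)],
)

query_log = []  # every statement sent to the database is recorded here


def run(sql, params=()):
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


# One query for the whole set.
query_log.clear()
batch = run("SELECT * FROM Users WHERE OrganizationId = ? ORDER BY UserId", (125,))
batch_queries = len(query_log)

# One query per record, which is what resolving lazy references one by one amounts to.
query_log.clear()
one_by_one = [
    run("SELECT * FROM Users WHERE UserId = ?", (n,))[0] for n in range(1, 1001)
]
lazy_queries = len(query_log)
```

Both approaches return the same rows; the difference is 1 statement versus 1,000, and against a real server each statement pays its own round trip.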
Edited 2015-10-02 04:03 (UTC)

[identity profile] sleepy-drago.livejournal.com 2015-10-02 06:27 am (UTC)(link)
You say it as if people don't know about "Later on...". Pros know exactly when to cash in their stock and move on from a startup to greener pastures.

C programmers have other sins. Often they mix up numbers that are identifiers for different things. After working on some legacy codebase I once coined a label for it: "program composed from integers". If the highest abstraction in the code you have to ship is an integer, you know you're screwed.
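
A small Python sketch of the alternative to a "program composed from integers": give each kind of identifier its own type, so that a user id cannot be passed where an order id is expected. `UserId`, `OrderId` and `cancel_order` are hypothetical names; in a statically typed language the check would happen at compile time instead of at run time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserId:
    value: int


@dataclass(frozen=True)
class OrderId:
    value: int


def cancel_order(order_id: OrderId) -> str:
    # Refuse anything that is not an OrderId, even if the number inside matches.
    if not isinstance(order_id, OrderId):
        raise TypeError(f"expected OrderId, got {type(order_id).__name__}")
    return f"order {order_id.value} cancelled"


# The same number wrapped in different types is not the same identifier.
same_number_different_kind = UserId(7) != OrderId(7)
```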
Edited 2015-10-02 06:29 (UTC)

[identity profile] metaclass.livejournal.com 2015-10-02 10:03 am (UTC)(link)
If we have a sufficiently strong type system and an adequate language, we can represent a lazy reference as an eight-byte long id (and still have the 2-3 lower bits for some flags, due to alignment).
But what we have are layers and layers of performance-killing wrappers like in (N)Hibernate, and it's much easier to work with simple types (long, int, whatever) in record fields.
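
The "lower bits for flags" trick, sketched with plain integers in Python. It assumes the ids are multiples of 8 (the way 8-byte-aligned addresses are), which leaves the three low bits at zero; the `LOADED` and `DIRTY` flags are made up for illustration.

```python
from typing import Tuple

ALIGNMENT = 8                 # assumption: every id is a multiple of 8
FLAG_MASK = ALIGNMENT - 1     # 0b111, the low bits that alignment leaves at zero

LOADED = 0b001                # hypothetical flags
DIRTY = 0b010


def pack(entity_id: int, flags: int) -> int:
    """Store the flags in the unused low bits of an aligned 64-bit id."""
    if not 0 <= entity_id < (1 << 64) or entity_id % ALIGNMENT != 0:
        raise ValueError("id must be a non-negative multiple of 8 below 2**64")
    if flags & ~FLAG_MASK:
        raise ValueError("flags must fit in the three low bits")
    return entity_id | flags


def unpack(word: int) -> Tuple[int, int]:
    """Split a packed word back into (id, flags)."""
    return word & ~FLAG_MASK, word & FLAG_MASK


word = pack(0x1000, LOADED | DIRTY)
```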

Also, these wrappers are not actually transparent. A simple long id cannot cause a LazyInitializationException, unlike some wrapper holding a reference to a long-closed Session.
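
The failure mode can be imitated in a few lines of Python. This is not Hibernate's API: `Session`, `SessionBoundRef` and `SessionClosedError` are toy stand-ins, with `SessionClosedError` playing the role of a LazyInitializationException.

```python
class SessionClosedError(RuntimeError):
    """Toy stand-in for an ORM error raised when loading after the session is closed."""


class Session:
    def __init__(self, rows):
        self._rows = rows
        self._open = True

    def load(self, entity_id):
        if not self._open:
            raise SessionClosedError(f"cannot load {entity_id}: session is closed")
        return self._rows[entity_id]

    def close(self):
        self._open = False


class SessionBoundRef:
    """A lazy wrapper that silently depends on the session it was created in."""

    def __init__(self, session, entity_id):
        self._session = session
        self._id = entity_id

    def get(self):
        return self._session.load(self._id)


rows = {42: "order #42"}
first_session = Session(rows)
ref = SessionBoundRef(first_session, 42)
plain_id = 42
first_session.close()

# The plain id is still usable with any later session; the wrapper is not.
second_session = Session(rows)
reloaded = second_session.load(plain_id)
```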

[identity profile] juan-gandhi.livejournal.com 2015-10-02 02:42 pm (UTC)(link)
Weird idea to issue 10k queries for retrieving 10k records.

[identity profile] microcell.livejournal.com 2015-10-02 12:13 pm (UTC)(link)
You actually never pass a real string to the function. What happens instead is that you create a new string object on the heap and pass its pointer to the function through the stack. So in a way there is not much difference, but there will be less work for the garbage collector.

If all your strings are static and already loaded in some array, the ID will save you the trouble of loading them each time. Or, if you have some resource provider, like Android does, it will let you have multi-language support at no cost at all.
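
A rough Python sketch of the resource-provider idea: code passes an integer id around and the provider resolves it to text for the current locale. The ids, the `STRINGS` tables and `get_string` are invented for illustration and only loosely modelled on looking up string resources by id; this is not Android's actual API.

```python
# Hypothetical resource ids.
GREETING = 1
FAREWELL = 2

# Preloaded, static string tables, one per locale (illustrative data).
STRINGS = {
    "en": {GREETING: "Hello", FAREWELL: "Goodbye"},
    "es": {GREETING: "Hola"},
}


def get_string(resource_id: int, locale: str, default_locale: str = "en") -> str:
    """Resolve an id to text, falling back to the default locale when needed."""
    table = STRINGS.get(locale, {})
    if resource_id in table:
        return table[resource_id]
    return STRINGS[default_locale][resource_id]
```

The caller's code stays the same for every language; only the table behind the id changes.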
Edited 2015-10-02 12:14 (UTC)

[identity profile] juan-gandhi.livejournal.com 2015-10-02 02:40 pm (UTC)(link)
I don't know what you are talking about. Creating a new string object on the heap? In the JVM? To pass a string to a function? Never heard of this.

[identity profile] sorhed.livejournal.com 2015-10-02 12:29 pm (UTC)(link)
Of course, this is a self-evident truth. Explicit ids have very few sensible uses, and here is one of them: say a client calls support and asks, why did my order execute two points higher than it should have? And the answer is: let's have a look, what's the ID?

(from which it follows that an id for this purpose should be pretty and human-readable, e.g. 137-234-987-202. And it should be generated separately and not coincide with the internal id, which is usually a UUIDv4, of which there should be enough for everyone).

And then ids still have to be passed explicitly from JavaScript to Scala, since the reference cannot be fully recreated there. But inside Scala code, of course, no explicit ids; kashrut forbids it.
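
A minimal Python sketch of keeping the two ids apart: a random UUIDv4 for internal use and a separately generated, grouped display id for reading out over the phone. `Order`, `new_display_id` and the lookup dictionary are hypothetical, and the sketch does not check display ids for collisions, which a real system would have to do.

```python
import secrets
import uuid


def new_display_id(groups: int = 4, digits: int = 3) -> str:
    """Random digits grouped like 137-234-987-202."""
    return "-".join(
        "".join(secrets.choice("0123456789") for _ in range(digits))
        for _ in range(groups)
    )


class Order:
    def __init__(self) -> None:
        self.internal_id = uuid.uuid4()       # never shown to the client
        self.display_id = new_display_id()    # generated separately, for humans


# Support looks an order up by the id the client reads out.
orders_by_display_id = {}
order = Order()
orders_by_display_id[order.display_id] = order
```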

[identity profile] juan-gandhi.livejournal.com 2015-10-02 02:39 pm (UTC)(link)
I actually described three layers.

1. an id instead of an error message (one containing the specific details)
2. passing integers around in code instead of type-safe, concrete pointers to objects
3. using integers to identify things in the real world

On the third point: of course some way of doing it is needed, but why it has to be decimal digits, I don't understand. base36 would sound more meaningful to me. BC-4KSNL, now that's more like it. You memorize it instantly, like a meme.
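
For reference, base36 is a few lines in Python. `to_base36` and `from_base36` are illustrative helper names; decoding leans on the built-in `int(text, 36)`, and hyphens are treated as mere grouping.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base36(n: int) -> str:
    """Encode a non-negative integer using digits 0-9 and letters A-Z."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n == 0:
        return "0"
    digits = []
    while n:
        n, remainder = divmod(n, 36)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def from_base36(text: str) -> int:
    """Decode a base36 string; hyphens are ignored as grouping characters."""
    return int(text.replace("-", ""), 36)
```

With these helpers the account number 10567 becomes `85J`, and a code like `BC-4KSNL` decodes to an ordinary integer and back.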

id vs pointer

[identity profile] a r (from livejournal.com) 2015-10-02 01:36 pm (UTC)(link)
1. Id itself is immutable
2. It does not have to live on the same cpu.
3. And there are no life time issues - stale ids are trivially detected.

Edited 2015-10-02 13:37 (UTC)

Re: id vs pointer

[identity profile] sassa-nf.livejournal.com 2015-10-02 04:51 pm (UTC)(link)
1. someone needs to maintain a map from Id to memory address
3. tell me about it
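
Both sides of this exchange show up in a small handle table, sketched here in Python: the table is the map from id to object that someone has to maintain, and a generation counter per slot is one common way to detect stale ids. `HandleTable` is a hypothetical name and the generation scheme is an assumption; nobody in the thread says how stale ids would be detected.

```python
from typing import Any, List, Tuple

Handle = Tuple[int, int]   # (slot index, generation)


class HandleTable:
    """Maps handles to objects; a slot's generation changes when it is freed."""

    def __init__(self) -> None:
        self._values: List[Any] = []
        self._generations: List[int] = []
        self._free: List[int] = []

    def insert(self, value: Any) -> Handle:
        if self._free:
            slot = self._free.pop()            # reuse a freed slot
            self._values[slot] = value
        else:
            slot = len(self._values)
            self._values.append(value)
            self._generations.append(0)
        return (slot, self._generations[slot])

    def _is_live(self, handle: Handle) -> bool:
        slot, generation = handle
        return 0 <= slot < len(self._values) and self._generations[slot] == generation

    def get(self, handle: Handle) -> Any:
        if not self._is_live(handle):
            raise KeyError(f"stale or unknown handle {handle}")
        return self._values[handle[0]]

    def remove(self, handle: Handle) -> None:
        if not self._is_live(handle):
            raise KeyError(f"stale or unknown handle {handle}")
        slot = handle[0]
        self._values[slot] = None
        self._generations[slot] += 1           # invalidates every old handle to this slot
        self._free.append(slot)


table = HandleTable()
old = table.insert("first object")
table.remove(old)
new = table.insert("second object")            # lands in the same slot as `old`
```

After the removal, `table.get(old)` raises `KeyError` while `table.get(new)` returns the new object, even though both handles point at the same slot.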

[identity profile] 109.livejournal.com 2015-10-04 09:17 pm (UTC)(link)
I read this thread with a shudder. Especially the part about the advantages of int over long.

> In short. It's stupid to pass around "ids" in a program.

It's not very prudent to present your own very specific use case as "always" and "everywhere". It's an obvious bias, but one has to make an effort to resist it.

In general, I'm surprised that you have lived to such an advanced age and still don't know that the right answer to any question is "it depends".

For example, I mostly work with data that does not fit into the memory of a single machine of any sensible size. In that setting, the idea of creating wrappers around numeric ids has nowhere to even come from.
Edited 2015-10-05 01:54 (UTC)