Added an article about common lisp
Emin Arslan 2025-01-25 19:02:33 +03:00
parent 20931357a2
commit c015447fcd

@ -0,0 +1,376 @@
+++
title = 'SICP takeaways'
summary = "A look into Common Lisp, what I've learnt from SICP, and a naive struct implementation."
date = 2025-01-24T23:53:38+03:00
draft = false
+++
When I first started learning Common Lisp, I didn't know what I was getting
into. I thought it would just be a fun adventure, maybe a couple weeks of fun,
in and out.
Alas, it was not so. I was stunned by the sheer amount of power a couple of
features - features that aren't even *that* crazy by themselves, in retrospect -
could provide to a programming language. It was unlike anything I'd ever seen.
Experienced lispers, of course, should know exactly what I'm talking about: Homoiconicity
and ***true macros***.
I was in love. I didn't want to admit it, however. Perhaps this is overly dramatic, but
our love was a forbidden one. I didn't want to be seen using a language no-one uses.
I knew that for the sake of having a career, it was best that I stick to programming
languages everyone else was using: C, C++, Java, and so on. After all, the millions
of people using these languages couldn't be wrong, could they?
In the end, I couldn't do it. I was too weak, or perhaps Lisp's allure too strong.
I gave in, and installed SBCL, Emacs and SLIME once more. Thus, I was once again
in the vicinity of divinity.
Then, I started working through an absolute classic: Structure and Interpretation
of Computer Programs. A book about programming, using Scheme (a dialect of Lisp)
as its main language. Not Common Lisp, but a Lisp nonetheless.
Scheme has its own goodies too, after all: a hygienic macro system,
a thinner standard library (although perhaps a little *too* thin), proper tail calls
being required by the standard (though most CL implementations optimize tail calls anyway)...
In this post, I will ramble/talk about data abstraction and the greatness of macros.
I will implement a small, very simple, no-inheritance object system built out of
nothing but `cons` cells and a handful of macros. This system will definitely not
be as complete as the Common Lisp Object System. Its only purpose is to demonstrate
that such a thing is ***possible***.
I will assume that you know a little bit about Lisp code, or - at the very least -
you are willing to try to follow along anyway.
I will not be providing a full introduction to Lisp, but please don't be discouraged.
## What is a cons?
Simply put, a cons is a pair. Just a pair of two objects. The first element
is called the `car`, the second element the `cdr` (the names are this way purely
for historical reasons). In C terms, a cons is effectively equivalent to:
```c
struct cons {
    OBJECT car;
    OBJECT cdr;
};
```
Except with Lisp syntax: we make a new cons with the `cons` function,
so `(cons 1 2)` makes a "cons cell" that contains 1 and 2.
The important thing here is that `cons` satisfies the closure property: one (or both)
of the elements can themselves be cons cells.
So you could do: `(cons 1 (cons 2 (cons 3 nil)))` (`nil` denotes the empty list).
You may notice that this structure is suspiciously similar to a singly linked list.
The `car` of a cons is the list's first element, and the `cdr` gets you the rest of the list.
Indeed, this is how lists are implemented in Lisp: they are singly linked lists.
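To make this concrete, here is a small REPL-style sketch (the variable name `*numbers*` is made up for illustration) showing that nested `cons` calls and `list` build the same structure:

```cl
;; Build a three-element list by hand, one cons cell at a time.
(defparameter *numbers* (cons 1 (cons 2 (cons 3 nil))))

(equal *numbers* (list 1 2 3)) ; => T
(car *numbers*)                ; => 1
(cdr *numbers*)                ; => (2 3)
(cdr (cdr (cdr *numbers*)))    ; => NIL, the end of the list
```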
`cons` cells are deceptively simple. You can build any number of interesting structures
out of them: trees, alists, plists, etc. In theory, we should be able to implement,
say, a C-style struct with this as well.
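As a rough illustration of those structures, the following sketch builds an alist, a plist and a tiny binary tree out of plain conses. The variable names are hypothetical; `assoc` and `getf` are the standard lookup functions.

```cl
;; An alist is a list of (key . value) conses.
(defparameter *point-alist* (list (cons 'x 1) (cons 'y 2)))
(cdr (assoc 'y *point-alist*)) ; => 2

;; A plist alternates keys and values in one flat list.
(defparameter *point-plist* (list :x 1 :y 2))
(getf *point-plist* :y)        ; => 2

;; A binary tree: each cons holds a left and a right branch.
(defparameter *tree* (cons (cons 1 2) (cons 3 4)))
(car (cdr *tree*))             ; => 3
```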
## Implementing structures
Think about what an object is, for a bit. An object is an instance of a class,
and a class itself is just an interface for accessing parts of that object,
and manipulating it in various ways.
This means that, in theory, you can have *any* kind of representation "under the hood", as long
as your language provides uniform ways to access, manipulate and modify these objects.
In C, structs are just descriptions of how to extract information from a
particular array of bytes. As I said, however, as long as you're consistent
about how you store and retrieve the information in a struct, you can implement
it however you want.
Notably, since we're using Common Lisp, all accesses to a field of an object always
look like function calls anyway. This is useful for a lot of reasons, but in this particular
case, it's useful mainly because field accesses aren't (or don't have to be) a special
operation provided by the programming language. They ***absolutely can*** be defined
as regular functions (except for setters, which we will define in terms of `defsetf`,
but that's not that much different, promise).
That property is exactly what we will rely on here. We can make a macro, say, `mydefstruct`,
that takes a name for our struct and a list of its fields. Then, if this macro
defined a function to create that struct, and accessor functions (getter/setter for those of
you in the Java world) for all of its
fields, that would be a good-enough implementation of structs. Client code does
not have to care that your structures are all linked lists under the hood,
their code behaves as if these structs were just an integral part of the language.
Then, we could implement methods by switching on the type of the first argument
of a method call, and calling the appropriate implementation. Voila! Object
oriented programming with very little language support. More sophisticated
systems can also be built in a similar manner, e.g. read-only fields could
be achieved by having the macro *not* define certain methods based on the input.
But that's beyond the scope of this blog post.
## First things first
Let's first define a few helper functions for our implementation. For one,
we need an easy way to get the symbol for a struct's constructor function:
```cl
(defun constructor-name (sym)
  (intern (concatenate 'string "MAKE-" (string sym))))
```
Similar things for its general accessor (which will be used for getting the
value and setting it with `setf`) and its setter (which will only be used
for implementing the `setf` form with `defsetf`).
```cl
(defun accessor-name (name sym)
  (intern (concatenate 'string (string name) "-" (string sym))))

(defun setter-name (name sym)
  (intern (concatenate 'string "SET-" (string name) "-" (string sym))))
```
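If the string juggling in these helpers looks opaque, this short sketch walks through the individual steps for the symbol `point`. It assumes the default reader settings, under which symbol names are upcased when read.

```cl
;; STRING returns the name of a symbol, which is upper case by default.
(string 'point)                                ; => "POINT"
(concatenate 'string "MAKE-" (string 'point))  ; => "MAKE-POINT"

;; INTERN finds or creates the symbol with that name in the current package,
;; so the result is the very same symbol the reader produces for make-point.
(eq (intern "MAKE-POINT") 'make-point)         ; => T

;; Case matters: a lower-case name yields a different symbol.
(eq (intern "make-point") 'make-point)         ; => NIL
```

This is why the helpers use `"MAKE-"` and `"SET-"` in upper case.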
Now we can write functions for defining:
- the constructor, with a function that takes a name,
and a list of slots, and returns a form that will define
the constructor when evaluated:
```cl
(defun constructor (name slots)
  `(defun ,(constructor-name name) ,slots
     (list ,@slots)))
```
- the slot accessors. This one will return a list of forms,
that will each define an accessor for one of the slots.
```cl
(defun accessors (name slots)
  (loop for slot in slots
        for i upfrom 0 collect
          `(defun ,(accessor-name name slot) (obj)
             (nth ,i obj))))
```
- the slot setters. Note that these won't actually
be used by the users of our library. In Common Lisp, we don't
really use separate functions for setters. For example,
if you can access a field through `(point-x my-point-object)`,
then you usually don't define a `set-point-x` function, but rather
use `(setf (point-x my-point-object) some-value)` to set it to `some-value`.
`setf` is another macro that actually expands this code into the appropriate
setter function. This provides a unified interface for accessing fields,
no matter what the underlying implementation is. Anyway, here's
my function for defining the setters:
```cl
(defun setters (name slots)
  (loop for slot in slots
        for i upfrom 0 collect
          `(defun ,(setter-name name slot) (obj val)
             (setf (nth ,i obj) val))))
```
- finally, the aforementioned `defsetf` forms:
```cl
(defun setfers (name slots)
  (loop for slot in slots collect
    `(defsetf ,(accessor-name name slot)
       ,(setter-name name slot))))
```
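To see what the short form of `defsetf` does in isolation, here is a minimal hand-written sketch. The names `first-item`, `set-first-item` and `*items*` are hypothetical and exist only for this example.

```cl
;; A plain reader function and a matching update function.
(defun first-item (list)
  (car list))

(defun set-first-item (list val)
  (setf (car list) val)) ; returns VAL, as SETF callers expect

;; Tell SETF that (setf (first-item x) v) means (set-first-item x v).
(defsetf first-item set-first-item)

(defparameter *items* (list 1 2 3))
(setf (first-item *items*) 10) ; => 10
*items*                        ; => (10 2 3)
```

The update function receives the accessor's arguments followed by the new value, and it should return that new value.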
As Common Lisp is a highly interactive language, we can try each
of these functions in the REPL with very little effort:
```
CL-USER> (constructor 'point '(x y))
(DEFUN MAKE-POINT (X Y) (LIST X Y))
CL-USER> (accessors 'point '(x y))
((DEFUN POINT-X (OBJ) (NTH 0 OBJ))
(DEFUN POINT-Y (OBJ) (NTH 1 OBJ)))
CL-USER> (setters 'point '(x y))
((DEFUN SET-POINT-X (OBJ VAL) (SETF (NTH 0 OBJ) VAL))
(DEFUN SET-POINT-Y (OBJ VAL) (SETF (NTH 1 OBJ) VAL)))
CL-USER> (setfers 'point '(x y))
((DEFSETF POINT-X SET-POINT-X)
(DEFSETF POINT-Y SET-POINT-Y))
```
Wow, the code generated by our functions looks good! Now
we just need a macro to tie it all together, and we will have
a pretty good first implementation for structs.
There isn't any trick to the macro itself: it just
takes its (unevaluated) arguments, and generates the code that
will be evaluated by calling the functions we defined earlier.
(Note the use of `,@` to splice the lists returned by `accessors`,
`setters`, and `setfers`).
```cl
(defmacro mydefstruct (name &rest slots)
  `(progn
     ,(constructor name slots)
     ,@(accessors name slots)
     ,@(setters name slots)
     ,@(setfers name slots)))
```
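The difference between `,` and `,@` is easy to miss, so here is a small standalone sketch. The variable `*forms*` is made up for the example.

```cl
(defparameter *forms* (list '(print 1) '(print 2)))

;; ,@ splices the elements of the list into the surrounding form.
`(progn ,@*forms*) ; => (PROGN (PRINT 1) (PRINT 2))

;; A plain , inserts the list itself as a single element.
`(progn ,*forms*)  ; => (PROGN ((PRINT 1) (PRINT 2)))
```

Only the first result is valid code, which is why the helpers that return a list of forms need `,@` inside the macro.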
As you may have noticed, perhaps the greatest strength of CL macros
is that they are, themselves, written *in lisp*. Which is why
we were so easily able to approach the problem of defining structs,
as a problem of generating code - and we were able to write regular
lisp code that generates the code we want, finally putting it in a
macro to achieve our goal.
A demonstration of how to define a struct with this, and use it:
```cl
(mydefstruct point x y)
;; this doesn't interact with CL's actual object system, but is still cool.
(defvar origin (make-point 0 0))
(point-x origin)
;; => 0
;; modifying an object
(defvar point1 (make-point 10 100))
(setf (point-x point1) 100) ;=> 100
(setf (point-y point1) 200) ;=> 200
(point-x point1) ; => 100
(point-y point1) ; => 200
```
Voila. Very, very simple system to define structs, without needing
any primitive for combining objects other than a pair.
These structs can have any number of fields, mind you. I just chose a simple
one to demonstrate.
## Type tags
One problem with this current implementation is that objects have no type information
at all. This means you could pass *any* struct with two elements as a `point` in the above
example. This can be useful in some cases, I'm sure. A broken clock is right twice a day
after all... but in general, I think it's safe to say that this behaviour is undesirable.
Instead, we want our getter and setter functions to give an error when passing a value
that *is not* a struct of the expected type. This will prevent many bugs by making sure
all type conversions are explicit, and no type is implicitly cast into another unrelated
type without the programmer's knowledge.
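A quick sketch of the problem, using hand-written stand-ins for what the untagged version would generate (the names `untyped-point-x` and `make-size` are hypothetical):

```cl
;; What an untagged accessor boils down to: just "take element 0".
(defun untyped-point-x (obj)
  (nth 0 obj))

;; A completely unrelated "struct" that happens to be a list too.
(defun make-size (width height)
  (list width height))

;; No error is signalled, even though a size is not a point.
(untyped-point-x (make-size 640 480)) ; => 640
```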
Of course, it would also help for a programmer to be able to inspect what type a particular
object belongs to. This is helpful because you might need to inspect such an object
at the REPL, and it might also be helpful in case you need a function to be able to
return several different types of objects, and check which one was actually returned.
There are many ways to implement this. We will be using a very simple solution: type tagging.
Essentially, we keep an extra element in the list underlying a struct - a tag that indicates
its type. We could store this as a string, or perhaps a unique integer generated every time
a struct is defined. However, since we're using Common Lisp, I think it's perfectly appropriate
for us to use a symbol as the tag. (Don't worry, unlike string comparisons, this shouldn't
incur much of a performance penalty. The same symbol name read in the same package always
yields the same object, so comparing tags with `eql` *should* be just a pointer comparison.)
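As a small sketch of why symbols are convenient tags: identity comparison is enough for them, whereas strings need a content comparison.

```cl
;; Symbols: identity comparison is all we need.
(eq 'point 'point)          ; => T
(eql 'point 'segment)       ; => NIL

;; Strings: EQUAL has to walk the characters to decide.
(equal "point" "point")     ; => T
(equal "point" "segment")   ; => NIL
```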
So, we just need to modify the code such that the first element of the list is the type tag.
First, the constructor:
```cl
(defun constructor (name slots)
  `(defun ,(constructor-name name) ,slots
     (list ',name ,@slots)))
```
No groundbreaking changes, really. We just add the name of the struct as a symbol to the front of the list.
This means that the first field of the struct now begins at index 1, however, so we need to update
our accessors and setters to match that. Since we also want our functions to perform type checking
at runtime, we should also add code for that into the generated accessors and setters.
Since every struct created with our new constructors contains type information, I think it would be
nice to add a helper function to get the type of an object. This way if we change the implementation
later we can just change this function without having to change every piece of code that checks
for type.
```cl
(defun obj-type (obj)
  (car obj))
```
Since we're adding type checks, we may as well put in a little more effort
and give the user a nice error message telling them what type was expected,
and what type was given. For that, we'll make another helper:
```cl
(defun make-error-message (real expected)
  (format nil "Accessor called on wrong type! Expected ~a but found ~a"
          expected real))
```
Then we can add the type checks to our existing functions, like so:
```cl
(defun accessors (name slots)
  (loop for slot in slots
        for i upfrom 1 collect
          `(defun ,(accessor-name name slot) (obj)
             (if (eql (obj-type obj) ',name)
                 (nth ,i obj)
                 (error (make-error-message (obj-type obj) ',name))))))

(defun setters (name slots)
  (loop for slot in slots
        for i upfrom 1 collect
          `(defun ,(setter-name name slot) (obj val)
             (if (eql (obj-type obj) ',name)
                 (setf (nth ,i obj) val)
                 (error (make-error-message (obj-type obj) ',name))))))
```
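To show the check in action without the macro machinery, here is a hand-written sketch of roughly what a generated accessor looks like. The names `tag-of` and `checked-point-x` are hypothetical stand-ins, chosen so they don't clash with the real definitions.

```cl
;; Stand-in for obj-type: the tag is the first element of the list.
(defun tag-of (obj)
  (car obj))

;; Roughly the shape of a generated, type-checked accessor for slot x.
(defun checked-point-x (obj)
  (if (eql (tag-of obj) 'point)
      (nth 1 obj)
      (error (format nil "Accessor called on wrong type! Expected ~a but found ~a"
                     'point (tag-of obj)))))

(checked-point-x (list 'point 3 4)) ; => 3

;; Passing a differently tagged list signals an error we can catch.
(handler-case (checked-point-x (list 'size 3 4))
  (error () :wrong-type))            ; => :WRONG-TYPE
```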
We don't really need to change anything else.
With that, our type checking struct implementation is reasonably usable.
At least for a primitive system built out of a macro and some lists,
it's actually fairly good.
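With tags in place, the method idea mentioned earlier (switching on the type of the first argument) becomes easy to sketch. This is only an illustration under assumed names: `make-rect`, `make-square` and `area` are hypothetical, and the tagged lists are built by hand in the same layout our constructors produce.

```cl
;; Hand-built tagged lists: the first element is the type tag.
(defun make-rect (width height)
  (list 'rect width height))

(defun make-square (side)
  (list 'square side))

;; A "method" that dispatches on the tag of its first argument.
;; ECASE signals an error for any tag it does not know about.
(defun area (shape)
  (ecase (car shape)
    (rect (* (second shape) (third shape)))
    (square (* (second shape) (second shape)))))

(area (make-rect 2 3)) ; => 6
(area (make-square 4)) ; => 16
```

A fuller system would generate such dispatch code with another macro, but that goes beyond what this post covers.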
The only thing left is to wrap it in a package, and only export `mydefstruct`.
These forms go at the top of the file, before the definitions above:
```cl
(defpackage :my-structures
  (:use :cl)
  (:export #:mydefstruct))

(in-package :my-structures)
```
There we go. Now our package is very nicely encapsulated, and only the useful
stuff is exported out of our package.
## Conclusion
Common Lisp's macros are truly amazing. We just created an entire system for
automatically defining new abstractions over data - and it looks, behaves and
feels just like it is part of the language, rather than something we added.
(Apart from being rather barebones, and not providing much in the form
of printing and reading our structures, this is actually fairly similar to
Common Lisp's standard `defstruct` in terms of what it provides.
Of course the standard `defstruct` is much better than this, but that's
beside the point.)
Side note:
Unfortunately, it doesn't really interact with the Common Lisp Object System
at all. This is to be expected, I'm just writing this to prove a point and to
demonstrate what I've learnt so far from SICP, not to replace something that
needs no replacing.
However, even though this system is not as good as the standard tools for
data abstraction, I think it's still a great demonstration of the language's
strengths.
The really stunning part for me, is that it was so *easy to do*. Too easy.
I actually hesitated to write about it on my blog, because it wasn't
really a challenge. I created a replacement system for the language's
standard way of creating data structures, and *it was so easy to do*, I'm
*hesitating to write about it*. It's an inferior replacement, sure,
but it's still perfectly functional.
It amazes me to no end that you can straight-up rewrite a significant
portion of the language in itself, and you can just change it however
you want to. I couldn't imagine doing anything even remotely similar
to that in, say, Java or C.
I hope you were entertained by this attempt at reinventing the wheel.
I certainly enjoyed making it.