Added an article about common lisp
Some checks failed
Build & Deploy / build-or-sth (push) Failing after 21s
This commit is contained in:
parent
20931357a2
commit
c015447fcd
376
content/posts/naive_classes.md
Normal file
@@ -0,0 +1,376 @@
+++
title = 'SICP takeaways'
summary = "A look into Common Lisp, what I've learnt from SICP, and a naive struct implementation."
date = 2025-01-24T23:53:38+03:00
draft = false
+++

When I first started learning Common Lisp, I didn't know what I was getting
into. I thought it would just be a fun adventure, maybe a couple of weeks of fun,
in and out.

Alas, it was not so. I was stunned by the sheer amount of power a couple of
features - features that aren't even *that* crazy by themselves, in retrospect -
could provide to a programming language. It was unlike anything I'd ever seen.

Experienced lispers, of course, should know exactly what I'm talking about: homoiconicity
and ***true macros***.

I was in love. I didn't want to admit it, however. Perhaps this is overly dramatic, but
our love was a forbidden one. I didn't want to be seen using a language no one uses.
I knew that for the sake of having a career, it was best that I stick to programming
languages everyone else was using: C, C++, Java, and so on. After all, the millions
of people using these languages couldn't be wrong, could they?

In the end, I couldn't do it. I was too weak, or perhaps Lisp's allure too strong.
I gave in, and installed SBCL, Emacs and SLIME once more. Thus, I was once again
in the vicinity of divinity.

Then, I started working through an absolute classic: Structure and Interpretation
of Computer Programs. A book about programming, using Scheme (a dialect of Lisp)
as its main language. Not Common Lisp, but a Lisp nonetheless.
Scheme has its own goodies too, after all: a hygienic macro system,
a thinner standard library (although perhaps a little *too* thin), proper tail calls
being required by the standard (though most CL implementations optimize them anyway)...

In this post, I will ramble about data abstraction and the greatness of macros.
I will implement a small, very simple, no-inheritance object system built out of
nothing but `cons` cells and a handful of macros. This system will definitely not
be as complete as the Common Lisp Object System. Its only purpose is to demonstrate
that such a thing is ***possible***.

I will assume that you know a little bit about Lisp code, or - at the very least -
you are willing to try to follow along anyway.

I will not be providing a full introduction to Lisp, but please don't be discouraged.

## What is a cons?

Simply put, a cons is a pair. Just a pair of two objects. The first element
is called the `car`, the second element the `cdr` (the names are this way purely
for historical reasons). In C terms, a cons is effectively equivalent to:

```c
struct cons {
    OBJECT car;
    OBJECT cdr;
};
```

Except with Lisp syntax. So we would make a new cons with the `cons` function,
like `(cons 1 2)` making a "cons cell" that contains 1 and 2.

The important thing here is that this satisfies the closure property. Meaning,
one (or both) of the elements can themselves be cons cells.

So you could do: `(cons 1 (cons 2 (cons 3 nil)))` (nil denotes an empty list).
You may notice that this structure is suspiciously similar to a singly linked list.
The `car` of a cons is the list's first element, and the `cdr` gets you the rest of the list.
Indeed, this is how lists are implemented in Lisp. They are singly linked lists.

`cons` cells are deceptively simple. You can build any number of interesting structures
out of them: trees, alists, plists, etc. In theory, we should be able to implement,
say, a C-style struct with this as well.
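
To make the pair-and-list idea concrete, here is a minimal sketch you can paste into a REPL. The variable names (`*by-hand*`, `*with-list*`, `*ages*`) are made up for illustration; `cons`, `car`, `cdr`, `list` and `assoc` are standard functions.

```cl
;; Build the same three-element list by hand and with LIST.
(defvar *by-hand* (cons 1 (cons 2 (cons 3 nil))))
(defvar *with-list* (list 1 2 3))

(equal *by-hand* *with-list*) ; => T
(car *by-hand*)               ; => 1
(cdr *by-hand*)               ; => (2 3)
(car (cdr *by-hand*))         ; => 2

;; An association list (alist) is just a list of pairs.
(defvar *ages* (list (cons 'alice 30) (cons 'bob 25)))
(cdr (assoc 'bob *ages*))     ; => 25
```

Nothing here is special to lists: `list` is only a convenience for nesting `cons` calls that end in `nil`.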

## Implementing structures

Think about what an object is, for a bit. An object is an instance of a class,
and a class itself is just an interface for accessing parts of that object,
and manipulating it in various ways.

This means that, in theory, you can have *any* kind of representation "under the hood", as long
as your language provides uniform ways to access, manipulate and modify these objects.
In C, structs are just descriptions of how to extract information from a
particular array of bytes. As I said, however, as long as you're consistent
about how you store and retrieve the information in a struct, you can implement
it however you want.

Notably, since we're using Common Lisp, all accesses to a field of an object always
look like function calls anyway. This is useful for a lot of reasons, but in this particular
case, it's useful mainly because field accesses aren't (or don't have to be) a special
operation provided by the programming language. They ***absolutely can*** be defined
as regular functions (except for setters, which we will hook into `setf` with `defsetf`,
but that's not that much different, promise).

That property is exactly what we will rely on here. We can make a macro, say, `mydefstruct`,
that takes a name for our struct and a list of its fields. Then, if this macro
defined a function to create that struct, and accessor functions (getters/setters for those of
you in the Java world) for all of its
fields, that would be a good-enough implementation of structs. Client code does
not have to care that your structures are all linked lists under the hood;
it behaves as if these structs were just an integral part of the language.
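
As a rough sketch of the idea, this is what such a constructor and its getters could look like if written by hand for a two-field `point`. The function `squared-length` is a hypothetical piece of client code, included only to show that callers never touch the underlying list.

```cl
;; Constructor: the "struct" is just a list of its field values.
(defun make-point (x y) (list x y))

;; Getters: each field lives at a fixed position in the list.
(defun point-x (obj) (nth 0 obj))
(defun point-y (obj) (nth 1 obj))

;; Client code only talks to the constructor and the accessors, so it
;; would keep working if the representation changed (say, to a vector).
(defun squared-length (p)
  (+ (* (point-x p) (point-x p))
     (* (point-y p) (point-y p))))

(squared-length (make-point 3 4)) ; => 25
```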

Then, we could implement methods by switching on the type of a method's first
argument, and calling the appropriate actual method. Voila! Object
oriented programming with very little language support. More sophisticated
systems can also be built in a similar manner, e.g. read-only fields could
be achieved by having the macro *not* define setters for certain fields based on the input.
But that's beyond the scope of this blog post.
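
For the curious, here is one way such dispatch might look. It is only a sketch, and it assumes every object is a list whose first element is a symbol naming its type. The `circle` and `rect` types and all function names are invented for this example.

```cl
;; Hypothetical tagged objects: the first element names the type.
(defun make-circle (radius) (list 'circle radius))
(defun make-rect (width height) (list 'rect width height))

;; One "actual method" per type.
(defun circle-area (obj) (* pi (expt (second obj) 2)))
(defun rect-area (obj) (* (second obj) (third obj)))

;; The generic entry point switches on the tag of its first argument.
;; ECASE signals an error if the tag matches none of the clauses.
(defun area (shape)
  (ecase (first shape)
    (circle (circle-area shape))
    (rect (rect-area shape))))

(area (make-rect 2 3)) ; => 6
```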

## First things first

Let's first define a few helper functions for our implementation. For one,
we need an easy way to get the symbol for a struct's constructor function:

```cl
;; POINT -> MAKE-POINT
(defun constructor-name (sym)
  (intern (concatenate 'string "MAKE-" (string sym))))
```

We need similar helpers for each slot's general accessor (which will be used for getting the
value and setting it with `setf`) and its setter (which will only be used
for implementing the `setf` form with `defsetf`).

```cl
;; POINT, X -> POINT-X
(defun accessor-name (name sym)
  (intern (concatenate 'string (string name) "-" (string sym))))

;; POINT, X -> SET-POINT-X
(defun setter-name (name sym)
  (intern (concatenate 'string "SET-" (string name) "-" (string sym))))
```
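
If `string` and `intern` are unfamiliar, the following REPL-style sketch shows what the helpers rely on. It assumes the standard reader settings, under which symbol names are upcased when read. That is why the prefixes above are written in upper case.

```cl
;; With the standard reader settings, 'point names the symbol POINT.
(string 'point)                                        ; => "POINT"

;; INTERN finds or creates the symbol with that name in the current
;; package (it also returns a second, status value).
(intern (concatenate 'string "MAKE-" (string 'point))) ; => MAKE-POINT
(eq (intern "MAKE-POINT") 'make-point)                 ; => T

;; A lowercase prefix would name a different symbol, |make-POINT|.
(eq (intern "make-POINT") 'make-point)                 ; => NIL
```

Note that `intern` uses whatever package is current when it runs, so names generated during macro expansion end up in the package of the code that uses the macro.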

Now we can write functions for defining:
- the constructor, with a function that takes a name
and a list of slots, and returns a form that will define
the constructor when evaluated:
```cl
(defun constructor (name slots)
  `(defun ,(constructor-name name) ,slots
     (list ,@slots)))
```
- the slot accessors. This one will return a list of forms
that will each define an accessor for one of the slots.
```cl
(defun accessors (name slots)
  (loop for slot in slots
        for i upfrom 0 collect
        ;; ,i inserts the slot's index into the generated code
        `(defun ,(accessor-name name slot) (obj)
           (nth ,i obj))))
```
- the slot setters. Note that these won't actually
be used by the users of our library. In Common Lisp, we don't
really use separate functions for setters. For example,
if you can access a field through `(point-x my-point-object)`,
then you usually don't define a `set-point-x` function, but rather
use `(setf (point-x my-point-object) some-value)` to set it to `some-value`.
`setf` is another macro that expands this form into a call to the appropriate
setter function. This provides a unified interface for accessing fields,
no matter what the underlying implementation is. Anyway, here's
my function for defining the setters:
```cl
(defun setters (name slots)
  (loop for slot in slots
        for i upfrom 0 collect
        `(defun ,(setter-name name slot) (obj val)
           (setf (nth ,i obj) val))))
```
- finally, the aforementioned `defsetf` forms:
```cl
(defun setfers (name slots)
  (loop for slot in slots collect
        `(defsetf ,(accessor-name name slot)
             ,(setter-name name slot))))
```
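
To see what `setf` and the short form of `defsetf` do on their own, here is a small sketch that is independent of the struct code. The names `*numbers*`, `middle` and `set-middle` are hypothetical.

```cl
(defvar *numbers* (list 1 2 3))

;; SETF already knows how to store into many built-in "places".
(setf (nth 1 *numbers*) 20)   ; *numbers* is now (1 20 3)
(setf (car *numbers*) 10)     ; *numbers* is now (10 20 3)

;; A hypothetical accessor of our own, plus a setter function for it.
(defun middle (lst) (nth 1 lst))
(defun set-middle (lst val) (setf (nth 1 lst) val))

;; The short form of DEFSETF teaches SETF to call SET-MIDDLE with the
;; accessor's arguments followed by the new value.
(defsetf middle set-middle)

(setf (middle *numbers*) 200) ; => 200
*numbers*                     ; => (10 200 3)
```

The short form expects the setter to return the new value, which `set-middle` does because its body is itself a `setf`.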

As Common Lisp is a highly interactive language, we can try each
of these functions in the REPL with very little effort:

```
CL-USER> (constructor 'point '(x y))
(DEFUN MAKE-POINT (X Y) (LIST X Y))

CL-USER> (accessors 'point '(x y))
((DEFUN POINT-X (OBJ) (NTH 0 OBJ))
 (DEFUN POINT-Y (OBJ) (NTH 1 OBJ)))

CL-USER> (setters 'point '(x y))
((DEFUN SET-POINT-X (OBJ VAL) (SETF (NTH 0 OBJ) VAL))
 (DEFUN SET-POINT-Y (OBJ VAL) (SETF (NTH 1 OBJ) VAL)))

CL-USER> (setfers 'point '(x y))
((DEFSETF POINT-X SET-POINT-X)
 (DEFSETF POINT-Y SET-POINT-Y))
```

Wow, the code generated by our functions looks good! Now
we just need a macro to tie it all together, and we will have
a pretty good first implementation for structs.

There isn't any trick to the macro itself: it just
takes its (unevaluated) arguments, and generates the code that
will be evaluated by calling the functions we defined earlier.
(Note the use of `,@` to splice in the lists returned by `accessors`,
`setters`, and `setfers`.)

```cl
(defmacro mydefstruct (name &rest slots)
  `(progn
     ,(constructor name slots)
     ,@(accessors name slots)
     ,@(setters name slots)
     ,@(setfers name slots)))
```

As you may have noticed, perhaps the greatest strength of CL macros
is that they are, themselves, written *in Lisp*. That is why
we were so easily able to approach the problem of defining structs
as a problem of generating code: we wrote regular
Lisp code that generates the code we want, and finally put it in a
macro to achieve our goal.

A demonstration of how to define a struct with this, and use it:

```cl
(mydefstruct point x y)
;; this doesn't interact with CL's actual object system, but is still cool.
(defvar origin (make-point 0 0))
(point-x origin)
;; => 0

;; modifying an object
(defvar point1 (make-point 10 100))
(setf (point-x point1) 100) ; => 100
(setf (point-y point1) 200) ; => 200
(point-x point1) ; => 100
(point-y point1) ; => 200
```

Voila. A very, very simple system to define structs, without needing
any primitive for combining objects other than a pair.

These structs can have any number of fields, mind you. I just chose a simple
one to demonstrate.

## Type tags

One problem with the current implementation is that objects have no type information
at all. This means you could pass *any* list with at least two elements as a `point` in the above
example. This can be useful in some cases, I'm sure. A broken clock is right twice a day
after all... but in general, I think it's safe to say that this behaviour is undesirable.

Instead, we want our getter and setter functions to signal an error when passed a value
that *is not* a struct of the expected type. This will prevent many bugs by making sure
all type conversions are explicit, and no type is implicitly cast into another unrelated
type without the programmer's knowledge.

Of course, it would also help for a programmer to be able to inspect what type a particular
object belongs to. This is helpful because you might need to inspect such an object
at the REPL, and it might also be helpful in case you need a function to be able to
return several different types of objects, and check which one was actually returned.

There are many ways to implement this. We will be using a very simple solution: type tagging.
Essentially, we keep an extra element in the list underlying a struct - a tag that indicates
its type. We could store this as a string, or perhaps a unique integer generated every time
a struct is defined. However, since we're using Common Lisp, I think it's perfectly appropriate
for us to use a symbol as the tag. (Don't worry, unlike string comparisons, this shouldn't
incur much of a performance penalty. Symbols read from source code are interned, so the same
name in the same package is always the same object, and comparing two of them *should*
be just a pointer comparison.)
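
A quick sketch of the difference, using only standard functions. The symbol names are arbitrary, and it assumes the standard reader settings that upcase symbol names.

```cl
;; The reader interns symbols, so the same name yields the same object.
(eq 'point 'point)            ; => T
(eql 'point 'circle)          ; => NIL
(eq 'point (intern "POINT"))  ; => T

;; Strings have to be compared character by character instead.
(string= "point" "point")     ; => T

;; Symbols made with MAKE-SYMBOL are not interned, so two of them are
;; distinct objects even when their names match.
(eq (make-symbol "POINT") (make-symbol "POINT")) ; => NIL
```

One consequence worth keeping in mind is that a symbol named `POINT` in one package is a different tag from a symbol named `POINT` in another.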

So, we just need to modify the code such that the first element of the list is the type tag.

First, the constructor:

```cl
(defun constructor (name slots)
  `(defun ,(constructor-name name) ,slots
     (list ',name ,@slots)))
```

No groundbreaking changes, really. We just add the name of the struct as a symbol to the front of the list.
This means that the first field of the struct now begins at index 1, however, so we need to update
our accessors and setters to match that. Since we also want our functions to perform type checking
at runtime, we should also add code for that into the generated accessors and setters.

Since every struct created with our new constructors contains type information, I think it would be
nice to add a helper function to get the type of an object. This way, if we change the implementation
later, we can just change this function without having to change every piece of code that checks
for type.

```cl
(defun obj-type (obj)
  (car obj))
```

Since we're adding type checks, we may as well put in a little more effort
and give the user a nice error message telling them what type was expected,
and what type was given. For that, we'll make another helper:

```cl
(defun make-error-message (real expected)
  (format nil "Accessor called on wrong type! Expected ~a but found ~a"
          expected real))
```

Then we can add the type checks to our existing functions, like so:

```cl
(defun accessors (name slots)
  (loop for slot in slots
        for i upfrom 1 collect
        `(defun ,(accessor-name name slot) (obj)
           (if (eql (obj-type obj) ',name)
               (nth ,i obj)
               (error (make-error-message (obj-type obj) ',name))))))

(defun setters (name slots)
  (loop for slot in slots
        for i upfrom 1 collect
        `(defun ,(setter-name name slot) (obj val)
           (if (eql (obj-type obj) ',name)
               (setf (nth ,i obj) val)
               (error (make-error-message (obj-type obj) ',name))))))
```
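
To see the check in action without the macro, here is a hand-written sketch of what the generated code for a `point` with slots `x` and `y` should amount to. Only the constructor and one accessor are spelled out. The `circle` object is hypothetical and exists only to trigger the error.

```cl
(defun obj-type (obj) (car obj))

(defun make-error-message (real expected)
  (format nil "Accessor called on wrong type! Expected ~a but found ~a"
          expected real))

;; Tagged constructor: the type symbol sits in front of the fields.
(defun make-point (x y) (list 'point x y))

;; Type-checked accessor: slot X now lives at index 1.
(defun point-x (obj)
  (if (eql (obj-type obj) 'point)
      (nth 1 obj)
      (error (make-error-message (obj-type obj) 'point))))

(point-x (make-point 1 2)) ; => 1

;; HANDLER-CASE catches the signalled error and returns its message text.
(handler-case (point-x (list 'circle 5))
  (error (c) (princ-to-string c)))
;; => "Accessor called on wrong type! Expected POINT but found CIRCLE"
```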

We don't really need to change anything else.

With that, our type-checking struct implementation is reasonably usable.
At least for a primitive system built out of a macro and some lists,
it's actually fairly good.

The only thing left is to wrap it in a package, and only export `mydefstruct`.
These two forms go at the top of the file, before any of the definitions above:

```cl
(defpackage :my-structures
  (:use :cl)
  (:export #:mydefstruct))

(in-package :my-structures)
```

There we go. Now our code is very nicely encapsulated, and only the useful
stuff is exported out of our package.
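
For reference, here is one way the pieces could be assembled into a single file. Treat it as a sketch, not the canonical layout. Two things in it are my own assumptions: the `eval-when` wrapper, which makes the helper functions available to the macro when the file is compiled with `compile-file`, and the client package `my-app`, which is hypothetical.

```cl
(defpackage :my-structures
  (:use :cl)
  (:export #:mydefstruct))

(in-package :my-structures)

;; The macro calls these helpers while it expands, so they must also
;; exist at compile time; EVAL-WHEN covers the COMPILE-FILE case.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun constructor-name (sym)
    (intern (concatenate 'string "MAKE-" (string sym))))

  (defun accessor-name (name sym)
    (intern (concatenate 'string (string name) "-" (string sym))))

  (defun setter-name (name sym)
    (intern (concatenate 'string "SET-" (string name) "-" (string sym))))

  (defun constructor (name slots)
    `(defun ,(constructor-name name) ,slots
       (list ',name ,@slots)))

  (defun accessors (name slots)
    (loop for slot in slots
          for i upfrom 1 collect
          `(defun ,(accessor-name name slot) (obj)
             (if (eql (obj-type obj) ',name)
                 (nth ,i obj)
                 (error (make-error-message (obj-type obj) ',name))))))

  (defun setters (name slots)
    (loop for slot in slots
          for i upfrom 1 collect
          `(defun ,(setter-name name slot) (obj val)
             (if (eql (obj-type obj) ',name)
                 (setf (nth ,i obj) val)
                 (error (make-error-message (obj-type obj) ',name))))))

  (defun setfers (name slots)
    (loop for slot in slots collect
          `(defsetf ,(accessor-name name slot)
               ,(setter-name name slot)))))

;; Runtime helpers called by the generated code.
(defun obj-type (obj)
  (car obj))

(defun make-error-message (real expected)
  (format nil "Accessor called on wrong type! Expected ~a but found ~a"
          expected real))

(defmacro mydefstruct (name &rest slots)
  `(progn
     ,(constructor name slots)
     ,@(accessors name slots)
     ,@(setters name slots)
     ,@(setfers name slots)))

;; A hypothetical client package that only sees MYDEFSTRUCT.
(defpackage :my-app
  (:use :cl :my-structures))

(in-package :my-app)

(mydefstruct point x y)

(defvar *p* (make-point 1 2))
(setf (point-y *p*) 20)
(point-y *p*) ; => 20
```

Because `intern` runs during macro expansion, `make-point`, `point-x` and friends are created in `my-app`, the package that is current where `mydefstruct` is used, so the client can call them without any package prefix.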

## Conclusion

Common Lisp's macros are truly amazing. We just created an entire system for
automatically defining new abstractions over data - and it looks, behaves and
feels just like it is part of the language, rather than something we added.
(Apart from being rather barebones, and not providing much in the form
of printing and reading our structures, this is actually fairly similar to
Common Lisp's standard `defstruct` in terms of what it provides.
Of course the standard `defstruct` is much better than this, but that's
beside the point.)

Side note:
Unfortunately, it doesn't really interact with the Common Lisp Object System
at all. This is to be expected. I'm just writing this to prove a point and to
demonstrate what I've learnt so far from SICP, not to replace something that
needs no replacing.

However, even though this system is not as good as the standard tools for
data abstraction, I think it's still a great demonstration of the language's
strengths.

The really stunning part for me is that it was so *easy to do*. Too easy.
I actually hesitated to write about it on my blog, because it wasn't
really a challenge. I created a replacement system for the language's
standard way of creating data structures, and *it was so easy to do*, I'm
*hesitating to write about it*. It's an inferior replacement, sure,
but it's still perfectly functional.

It amazes me to no end that you can straight-up rewrite a significant
portion of the language in itself, and you can just change it however
you want to. I couldn't imagine doing anything even remotely similar
to that in, say, Java or C.

I hope you were entertained by this attempt at reinventing the wheel.
I certainly enjoyed making it.