It has been about three years since I first sat in a hotel in Lithuania throwing together the first versions of purescript-erl-pinto and purescript-erl-stetson so that we could get started on a project for one of our clients.
Quite a lot of code has been written against those projects internally by colleagues and myself and over time:
The purerl cookbook has been updated for these latest releases, as has demo-ps, but I felt it worthwhile to highlight some of the changes in a few (of my increasingly rare) blog posts.
In this post we'll be looking at Processes and the concept of 'self'.
A common practice in Erlang codebases is to spawn a new process and use its Pid for communication.
example() ->
    Pid = spawn_link(fun receive_message/0),
    Pid ! hi.

receive_message() ->
    receive
        _SomeMessage ->
            ?PRINT("Got a message"),
            ok
    end.
In Purerl, the type Pid lives in Erl.Process.Raw and is just an imported foreign type
foreign import data Pid :: Type
This typically isn't used much outside of some specific FFI cases. It is generally far better to use the types found in Erl.Process, which carry the phantom type msg, meaning all the sends and receives are limited to the types of message that the process has declared it will receive. Process msg itself is just a newtype around Raw.Pid, of course.
newtype Process (a :: Type)
  = Process Raw.Pid
To create a typed process, one could call spawnLink and provide a callback which will be executed in the process created by the underlying call to spawn_link in Erlang. A change made fairly early on in development was to move this callback from something that took some context to something that operated inside ProcessM msg r, with the context being provided by that monad. This will become important later on in this blog entry, so I'll demonstrate both versions here, starting with the original context-based one.
"Given a callback that accepts a SpawnLinkContext typed around msg, run that callback inside a new process and return that new process, also typed around msg"
spawnLink :: forall msg. (SpawnLinkContext msg -> Effect Unit) -> Effect (Process msg)
That context then provided the means of receiving messages, being defined as something like
type SpawnLinkContext msg =
  { receive :: Effect msg
  , receiveWithTimeout :: Timeout -> Effect (Maybe msg)
  }
Thus, the Erlang example, re-written in Purerl, would have looked something like this:
example :: Effect Unit
example = do
  pid <- spawnLink receiveMessage
  pid ! "hi"

receiveMessage :: SpawnLinkContext String -> Effect Unit
receiveMessage c = do
  msg <- c.receive
  log "Got a message"
"Evaluate the given code in the context of a ProcessM typed around msg"
spawnLink :: forall msg. ProcessM msg Unit -> Effect (Process msg)
Quite simply, receive and friends are defined as functions that operate inside ProcessM, again all typed around msg:
receive :: forall msg. ProcessM msg msg
receive = ProcessM Raw.receive
This cuts down on the cruft somewhat: instead of having to pass a context everywhere, one can simply write functions in the context of ProcessM. Re-writing that initial example, it now looks like this (note that the log call needs lifting into ProcessM because it is written as an Effect):
example :: Effect Unit
example = do
  pid <- spawnLink receiveMessage
  pid ! "hi"

receiveMessage :: ProcessM String Unit
receiveMessage = do
  msg <- receive
  liftEffect $ log "Got a message"
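To show where the msg type starts to pay off, here is a minimal sketch of a process with its own message ADT and a receive loop. The module name, CounterMsg and counter are hypothetical names made up for illustration; the sketch assumes Erl.Process exports ProcessM, receive, spawnLink and (!) as described in this post.

```purescript
module Example.Counter where

import Prelude

import Effect (Effect)
import Effect.Class (liftEffect)
import Effect.Console (log)
import Erl.Process (ProcessM, receive, spawnLink, (!))

-- Hypothetical message type: the only messages this process accepts.
data CounterMsg
  = Increment
  | Report

-- A receive loop that carries its state as an argument,
-- much like a hand-rolled Erlang loop function.
counter :: Int -> ProcessM CounterMsg Unit
counter n = do
  msg <- receive
  case msg of
    Increment -> counter (n + 1)
    Report -> do
      liftEffect $ log ("Count is " <> show n)
      counter n

example :: Effect Unit
example = do
  -- spawnLink returns a Process CounterMsg, so only CounterMsg values can be sent.
  pid <- spawnLink (counter 0)
  pid ! Increment
  pid ! Increment
  pid ! Report
  -- pid ! "hi" would be rejected by the compiler: String is not CounterMsg.
```

The point of the sketch is the last comment: the mistake that Erlang would only reveal at runtime (or never) becomes a type error.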
It turns out that this is quite a nice pattern for representing the different types of process available in Erlang (OTP, Cowboy and indeed our own application code), consider:
All of these could be run as ProcessM, except they have more types associated with them and various functions designed for use in those specific contexts. For example, in the simple case of Cowboy:
type WebSocketInfoHandler msg state
  = msg -> state -> WebSocketResult msg (WebSocketCallResult state)
We have the types msg and state in our type because the callbacks involved tend to take state, and there is a callback (info) for messages received by the loop handler, typed around msg.
In the more complicated case of an OTP GenServer, it looks like this:
type CallFn reply cont stop msg state
  = From reply -> state -> ResultT cont stop msg state (CallResult reply cont stop state)
Most operations take place inside that ResultT which encodes the cont, stop, msg, and state types for use with our operations. (cont being the message that can be received by handle_continue, stop being a custom stop reason, msg being messages received by handle_info and state being the state of the gen server).
We also end up with our own contexts in our own codebases for specific process types around common units of business logic.
In most of these cases, just like with Process.spawnLink, something gets returned that represents the started process, for example with the GenServer:
startLink :: forall cont stop msg state.
  (ServerSpec cont stop msg state)
  -> Effect (StartLinkResult (ServerPid cont stop msg state))
Here we have a ServerPid cont stop msg state returned to the caller, again keeping a lot of useful information around to help us make calls into a GenServer, but the type we're interested in here is msg. Most generic APIs will be written around the concept of a Process msg, and what we have here is a ServerPid cont stop msg state.
The logical step here is to expose
toProcess :: ServerPid cont stop msg state -> Process msg
It turns out that this is a very common operation and so a typeclass is born and added to Erl.Process for everybody to implement when writing this kind of code.
class HasProcess b a where
  getProcess :: a -> Process b
In fact, two typeclasses are born because some APIs only need a Pid after all.
class HasPid a where
  getPid :: a -> Pid
Most functions that take a Process msg therefore don't actually care about it being a Process msg, only that a Process msg can be obtained from the type:
callMe :: forall p msg.
  HasProcess msg p =>
  p -> Effect Unit
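As a rough illustration of how a library or application type might opt in, the sketch below wraps a Process in a newtype and gives it a HasProcess instance. WorkerMsg, WorkerPid and ping are hypothetical names invented for this example; the sketch assumes Erl.Process exports the HasProcess class, getProcess and (!) in the shape shown in this post.

```purescript
module Example.Worker where

import Prelude

import Effect (Effect)
import Erl.Process (Process, class HasProcess, getProcess, (!))

-- Hypothetical message type for an imaginary worker process.
data WorkerMsg
  = Ping
  | Shutdown

-- Hypothetical wrapper, the kind of thing a start function might hand back.
newtype WorkerPid = WorkerPid (Process WorkerMsg)

-- Teach generic APIs how to get at the underlying typed process.
instance hasProcessWorkerPid :: HasProcess WorkerMsg WorkerPid where
  getProcess (WorkerPid p) = p

-- Accepts anything from which a Process WorkerMsg can be obtained,
-- without knowing how that type is structured.
ping :: forall p. HasProcess WorkerMsg p => p -> Effect Unit
ping p = getProcess p ! Ping
```

A caller can then pass a WorkerPid straight to ping without unwrapping it first.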
And now all of those custom types can be used with a whole suite of APIs without having to unpack a convoluted structure of newtypes.
There is one more common operation that has been ignored so far, and that is the concept of self. An incredibly common thing in Erlang is to invoke self to get the Pid of the current process.
Self = self(),
some_api:call_me(Self)
For a while, we had self functions on every module that exported some monad in which process logic could be evaluated, and these would return the full type of the process (complete with cont, stop, state, etc). There were a lot of self functions being exported and imported, and in 99.99% of all cases they were immediately followed by a call to getProcess using the HasProcess typeclass implementation for that system.
(me :: Process MyMsg) <- getProcess <$> GenServer.self
What does self mean then? The correct answer is as written above, but the correct answer isn't always the nicest answer. It was very rare that we would need anything from self other than the current Process msg, and we were running into issues in modules that had code for more than one of these contexts in them: whose self are we using anyway?
The answer was to be pragmatic and create a typeclass for 'any m' that allowed that 'm' to export a Process msg:
class HasSelf (m :: Type -> Type) msg | m -> msg where
  self :: m (Process msg)
For the case of anything running inside a ProcessM, this isn't any more complicated than calling Raw.self to get the current pid and wrapping it up with the relevant newtypes:
instance selfProcessM :: HasSelf (ProcessM a) a where
  self :: forall a. ProcessM a (Process a)
  self = ProcessM $ Process <$> Raw.self
Similar implementations then exist for Pinto/Stetson contexts, allowing code to simply call Process.self from practically anywhere to get a typed Process msg valid for the current context.
self >>= (liftEffect <<< SomeApi.callMe)
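A slightly fuller sketch of self in use: a client asks for its own typed Process and hands it to a server as a reply address. ServerMsg, ClientMsg, server and client are hypothetical names for this example only; it assumes Erl.Process exports Process, ProcessM, receive, self, spawnLink and (!) as described above.

```purescript
module Example.PingPong where

import Prelude

import Effect (Effect)
import Effect.Class (liftEffect)
import Effect.Console (log)
import Erl.Process (Process, ProcessM, receive, self, spawnLink, (!))

-- Hypothetical messages: a Ping carries the typed address to reply to.
data ServerMsg = Ping (Process ClientMsg)

data ClientMsg = Pong

server :: ProcessM ServerMsg Unit
server = do
  msg <- receive
  case msg of
    Ping replyTo -> liftEffect $ replyTo ! Pong
  server

client :: Process ServerMsg -> ProcessM ClientMsg Unit
client srv = do
  -- Inside ProcessM ClientMsg, self is a Process ClientMsg,
  -- which is exactly what Ping requires.
  me <- self
  liftEffect $ srv ! Ping me
  _ <- receive
  liftEffect $ log "Got a pong"

example :: Effect Unit
example = do
  srv <- spawnLink server
  _ <- spawnLink (client srv)
  pure unit
```

Because the reply address is typed, the server can only ever send the client messages the client has declared it will handle.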
All typed, all safe, nobody sent messages they can't handle - living the dream.
2020 © Rob Ashton. ALL Rights Reserved.