Build Your Own Haskell Compiler #2
為了讓大家可以簡單地寫 desugar function , C 使用 Template Haskell ,以便編譯時自動產生程式碼,輸出給 Haskell.Src.Exts
用的 desugar function template ,於是做 if
的 desugar 時,只要寫:
-- 注意:這是最早的版本,沒有使用帶更多資訊的 `Haskell.Src.Exts.Annotated`
deIfExp (If exp1 exp2 exp3) = deIfExp $
Case exp1
[ Alt (SrcLoc "" 0 0) (PApp (UnQual (Ident "False")) []) (UnGuardedRhs exp3) Nothing
, Alt (SrcLoc "" 0 0) (PApp (UnQual (Ident "True")) []) (UnGuardedRhs exp2) Nothing
]
省掉手寫處理其他 Syntax 的三百多行。
但身為一個連 Monoid, Functor, Applicative 都還搞不清楚的笨蛋,怎麼可能一下子就懂 TH (Template Haskell)在搞什麼呢?
所以來拆解 7e37b79 吧!看看 TH 到底是怎麼幫助我們的 XD
文末附上的是只處理 Haskell.Src.Exts.Syntax.Module
的精簡版,沒有 7e37b79 那麼複雜,以下一點一點說明:
deriveDesugarTemplate
程式的進入點是 deriveDesugarTemplate :: String -> Q [Dec]
,吃的 String
是之後用來生 desugar template 的 function 名稱,吐出來的 Q [Dec]
在 $()
裡面會變成程式,在 Main.hs 中這樣寫:
-- 前面有些 code
$(deriveDesugarTemplate "xx")
-- 後面有些 code
之後就有個叫 xx
的 function 可以用。寫 xx "DeIf" "deIf"
就會生出想要的三百多行程式。
deriveDesugarTemplate :: String -> Q [Dec]
deriveDesugarTemplate funName = do
Just moduleName <- lookupTypeName "Module"
moduleInfo <- reify moduleName
runIO $ putStrLn $ show moduleInfo
let code = genDataTransformer "deIf" moduleInfo
runIO $ putStrLn $ code
fmap pure $ funD (mkName funName) [clause (map (varP . mkName) ["modName", "funPrefix"]) (normalB (varE (mkName "undefined"))) []]
一開始的:
Just moduleName <- lookupTypeName "Module"
先從看得見的地方(在此例中是 Main.hs 和 DeriveDesugarTemplate import 的 modules),找出叫做 "Module"
的 Name
,再靠:
moduleInfo <- reify moduleName
把 Module 的形狀拆出來。
runIO
則可以幫我們在 Q
中印出 moduleInfo
,看看接下來該怎麼寫。
Language.Haskell.Src.Exts.Syntax.Module
Module :
Module SrcLoc ModuleName [ModulePragma] (Maybe WarningText) (Maybe [ExportSpec]) [ImportDecl] [Decl]
在 TH 中長這個樣子:
TyConI
( DataD
[] -- Cxt
Language.Haskell.Exts.Syntax.Module -- Name
[] -- [TyVarBndr]
[ NormalC -- [Con]
Language.Haskell.Exts.Syntax.Module
[ (NotStrict, ConT Language.Haskell.Exts.SrcLoc.SrcLoc) -- SrcLoc
, (NotStrict, ConT Language.Haskell.Exts.Syntax.ModuleName) -- ModuleName
, (NotStrict, AppT ListT (ConT Language.Haskell.Exts.Syntax.ModulePragma)) -- [ModulePragma]
, (NotStrict, AppT (ConT GHC.Base.Maybe) (ConT Language.Haskell.Exts.Syntax.WarningText)) -- Maybe WarningText
, (NotStrict, AppT (ConT GHC.Base.Maybe) (AppT ListT (ConT Language.Haskell.Exts.Syntax.ExportSpec))) -- Maybe [ExportSpec]
, (NotStrict, AppT ListT (ConT Language.Haskell.Exts.Syntax.ImportDecl)) -- [ImportDecl]
, (NotStrict, AppT ListT (ConT Language.Haskell.Exts.Syntax.Decl)) -- [Decl]
]
]
[] -- [Name]
)
我們關心的,主要是 [Con]
中那些 Types , genDataTransformer
會幫忙產生處理 Module 的 function 。
genDataTransformer
genDataTransformer
吃新函數的 prefix 和 Module 在 TH 中的形狀。可以拆成兩部分看,先看後半用 LambdaCase
寫的部分:
\case
TyConI (DataD [] (Name (OccName tyNameStr) _) [] cons _) ->
let
typeSig = funPrefix ++ tyNameStr ++ " :: " ++ tyNameStr ++ " -> " ++ tyNameStr ++ "\n"
in
concat $ typeSig : map (conToDef tyNameStr) cons
others -> error $ "genDataTransformer " ++ show others ++ " not implemented"
在 Module 的例子中, typeSig 會變成 "deIfModule :: Module -> Module\n"
,而下面的 map (conToDef tyNameStr) cons
生出來的就是 function body 。
conToDef tyNameStr (NormalC (Name (OccName conNameStr) _) slots) =
let
varNames = varNamesForNormalSlots slots
in
funPrefix ++ tyNameStr ++
" (" ++ intercalate " " (conNameStr : varNames) ++ ") = " ++
intercalate " " (conNameStr : zipWith (\varNameStr (_, ty) -> exprFromType varNameStr funPrefix ty) varNames slots) ++
"\n"
其中:
funPrefix ++ tyNameStr ++
" (" ++ intercalate " " (conNameStr : varNames) ++ ")"
會生出:
deIfModule (Module srcLoc moduleName modulePragma warningText exportSpec importDecl decl)
也就是說, varNamesForNormalSlots
會把 slots 收集起來,變成可以 pattern match 的樣子。
intercalate
在這裡做的事情只是在 name 之間插入空白,再接成 String 。
intercalate " " (conNameStr : zipWith (\varNameStr (_, ty) -> exprFromType varNameStr funPrefix ty) varNames slots)
會生出:
Module (deIfSrcLoc srcLoc) (deIfModuleName moduleName) (fmap (deIfModulePragma) modulePragma) (fmap (deIfWarningText) warningText) (fmap (fmap (deIfExportSpec)) exportSpec) (fmap (deIfImportDecl) importDecl) (fmap (deIfDecl) decl)
exprFromType
生出了遞迴遍歷各個 slot 需要寫的 code 。
varNamesForNormalSlots
finalNames = go M.empty rawNames where
go nameCount (nameStr : others) =
if S.member nameStr dupNames then
let
nameCount' = M.insertWith (+) nameStr 1 nameCount
serial = nameCount' M.! nameStr
in
(nameStr ++ show serial) : go nameCount' others
else
nameStr : go nameCount others
go _ _ = []
這段做的事情是,找出重複的名字,並在名字後面加上數字。這麼一來, Exp Exp Exp
就不會變成 exp exp exp
,而是可以區分彼此的 exp1 exp2 exp3
。先看現在的 nameStr
在不在放著重複名字的 dupNames
中,如果在的話,就用加上數字的名字(nameStr ++ show serial
),不然用普通的名字就好(nameStr
)。
dupNames :: S.Set String
dupNames = go S.empty S.empty rawNames where
go res seen (nameStr : others) =
if S.member nameStr seen then
go (S.insert nameStr res) seen others
else
go res (S.insert nameStr seen) others
go res _ _ = res
dupNames
也很好玩,一開始有兩個空的 Set ,前面的用來放重複的名字,後面的用來放看過的名字。從 rawNames
中把名字一個一個拿出來看,如果這個名字已經看過了(S.member nameStr seen
),就塞到 res
中;如果沒看過,就塞到 seen
中。
rawNames
又是怎麼來的?靠 varNameFromType 從 slots 裡面拆出來的:
rawNames = map (\(_, ty) -> case varNameFromType ty of Name (OccName str) _ -> lowerHead str) slots
拆出來後把字首小寫(lowerHead
)。
varNameFromType
varNameFromType :: Type -> Name
varNameFromType (ConT name) = name
varNameFromType (all @ (AppT f x)) = case maybeVarNameFromType all of
Just o -> o
Nothing -> varNameFromType x
varNameFromType others = error $ "varNameFromType " ++ show others ++ " not implemented"
拆 Name 出來時,有意思的是那個 maybeVarNameFromType
。會濾掉包裝或基本的 Type :
trivialTypes :: S.Set String
trivialTypes = S.fromList ["String", "Maybe", "Int", "Rational", "Char", "Integer", "Bool"]
做法是:
maybeVarNameFromType :: Type -> Maybe Name
maybevarNameFromType (ConT (name @ (Name (OccName nameStr) _)))
| S.member nameStr trivialTypes = Nothing
| otherwise = Just name
maybeVarNameFromType (AppT f x) = maybeVarNameFromType f <|> maybeVarNameFromType x
maybeVarNameFromType _ = Nothing
這樣就不用特地把 Maybe WarnText
寫成 mWarnText
,只要寫成 warnText
就好。該如何正確地處理它們,則交給 exprFromType
。
exprFromType 和 transExprFromType
exprFromType :: String -> String -> Type -> String
exprFromType name funPrefix ty = "(" ++ transExprFromType funPrefix ty ++ " " ++ name ++ ")"
exprFromType
套上了最外面的 ()
與目標物(name
)。重點在:
transExprFromType :: String -> Type -> String
transExprFromType funPrefix = genExpr where
genExpr = \case
ConT (Name (OccName name) _)
| S.member name trivialTypes -> "id"
| otherwise -> funPrefix ++ name
AppT ListT x -> "fmap (" ++ genExpr x ++ ")"
AppT (ConT (Name (OccName "Maybe") _)) x -> "fmap (" ++ genExpr x ++ ")"
AppT (AppT (TupleT 2) a) b -> "(" ++ genExpr a ++ ") *** (" ++ genExpr b ++ ")"
others -> error $ "exprFromType " ++ show others ++ " not implemented"
這邊才真的看每個 Type 的形狀,決定該不該加 fmap
,該加幾層。
那個 ***
是 Control.Arrow
裡的東西,又提醒我,得把看到一半的文章看完,這樣才有機會搞懂 FRP 。
簡化過的 DerivieDesugarTemplate.hs 全文
module DeriveTemplate where
import Data.Functor
import Data.List
import Data.Char
import Data.Monoid
import Control.Applicative
import Control.Monad
import qualified Data.Set as S
import qualified Data.Map as M
import Language.Haskell.TH
import Language.Haskell.TH.Syntax
trivialTypes :: S.Set String
trivialTypes = S.fromList ["String", "Maybe", "Int", "Rational", "Char", "Integer", "Bool"]
deriveDesugarTemplate :: String -> Q [Dec]
deriveDesugarTemplate funName = do
Just moduleName <- lookupTypeName "Module"
moduleInfo <- reify moduleName
runIO $ putStrLn $ show moduleInfo
let code = genDataTransformer "deIf" moduleInfo
runIO $ putStrLn $ code
fmap pure $ funD (mkName funName) [clause (map (varP . mkName) ["modName", "funPrefix"]) (normalB (varE (mkName "undefined"))) []]
maybeVarNameFromType :: Type -> Maybe Name
maybevarNameFromType (ConT (name @ (Name (OccName nameStr) _)))
| S.member nameStr trivialTypes = Nothing
| otherwise = Just name
maybeVarNameFromType (AppT f x) = maybeVarNameFromType f <|> maybeVarNameFromType x
maybeVarNameFromType _ = Nothing
varNameFromType :: Type -> Name
varNameFromType (ConT name) = name
varNameFromType (all @ (AppT f x)) = case maybeVarNameFromType all of
Just o -> o
Nothing -> varNameFromType x
varNameFromType others = error $ "varNameFromType " ++ show others ++ " not implemented"
varNamesForNormalSlots :: [StrictType] -> [String]
varNamesForNormalSlots slots =
let
dupNames :: S.Set String
dupNames = go S.empty S.empty rawNames where
go res seen (nameStr : others) =
if S.member nameStr seen then
go (S.insert nameStr res) seen others
else
go res (S.insert nameStr seen) others
go res _ _ = res
rawNames = map (\(_, ty) -> case varNameFromType ty of Name (OccName str) _ -> lowerHead str) slots
finalNames = go M.empty rawNames where
go nameCount (nameStr : others) =
if S.member nameStr dupNames then
let
nameCount' = M.insertWith (+) nameStr 1 nameCount
serial = nameCount' M.! nameStr
in
(nameStr ++ show serial) : go nameCount' others
else
nameStr : go nameCount others
go _ _ = []
in
finalNames
transExprFromType :: String -> Type -> String
transExprFromType funPrefix = genExpr where
genExpr = \case
ConT (Name (OccName name) _)
| S.member name trivialTypes -> "id"
| otherwise -> funPrefix ++ name
AppT ListT x -> "fmap (" ++ genExpr x ++ ")"
AppT (ConT (Name (OccName "Maybe") _)) x -> "fmap (" ++ genExpr x ++ ")"
AppT (AppT (TupleT 2) a) b -> "(" ++ genExpr a ++ ") *** (" ++ genExpr b ++ ")"
others -> error $ "exprFromType " ++ show others ++ " not implemented"
exprFromType :: String -> String -> Type -> String
exprFromType name funPrefix ty = "(" ++ transExprFromType funPrefix ty ++ " " ++ name ++ ")"
lowerHead :: String -> String
lowerHead (c:cs) = toLower c : cs
genDataTransformer :: String -> Info -> String
genDataTransformer funPrefix =
let
conToDef tyNameStr (NormalC (Name (OccName conNameStr) _) slots) =
let
varNames = varNamesForNormalSlots slots
in
funPrefix ++ tyNameStr ++
" (" ++ intercalate " " (conNameStr : varNames) ++ ") = " ++
intercalate " " (conNameStr : zipWith (\varNameStr (_, ty) -> exprFromType varNameStr funPrefix ty) varNames slots) ++
"\n"
in
\case
TyConI (DataD [] (Name (OccName tyNameStr) _) [] cons _) ->
let
typeSig = funPrefix ++ tyNameStr ++ " :: " ++ tyNameStr ++ " -> " ++ tyNameStr ++ "\n"
in
concat $ typeSig : map (conToDef tyNameStr) cons
others -> error $ "genDataTransformer " ++ show others ++ " not implemented"
讀到這裡,認為有機會靠 TH 寫出簡潔的 pretty print ,但還是不確定從何下手才好。
順其自然吧。