Build Your Own Haskell Compiler #2
為了讓大家可以簡單地寫 desugar function , C 使用 Template Haskell ,以便編譯時自動產生程式碼,輸出給 Haskell.Src.Exts 用的 desugar function template ,於是做 if 的 desugar 時,只要寫:
-- 注意:這是最早的版本,沒有使用帶更多資訊的 `Haskell.Src.Exts.Annotated`
deIfExp (If exp1 exp2 exp3) = deIfExp $
	Case exp1
  	[ Alt (SrcLoc "" 0 0) (PApp (UnQual (Ident "False")) []) (UnGuardedRhs exp3) Nothing
    , Alt (SrcLoc "" 0 0) (PApp (UnQual (Ident "True")) []) (UnGuardedRhs exp2) Nothing
    ]省掉手寫處理其他 Syntax 的三百多行。
但身為一個連 Monoid, Functor, Applicative 都還搞不清楚的笨蛋,怎麼可能一下子就懂 TH (Template Haskell)在搞什麼呢?
所以來拆解 7e37b79 吧!看看 TH 到底是怎麼幫助我們的 XD
文末附上的是只處理 Haskell.Src.Exts.Syntax.Module 的精簡版,沒有 7e37b79 那麼複雜,以下一點一點說明:
deriveDesugarTemplate
程式的進入點是 deriveDesugarTemplate :: String -> Q [Dec] ,吃的 String 是之後用來生 desugar template 的 function 名稱,吐出來的 Q [Dec] 在 $() 裡面會變成程式,在 Main.hs 中這樣寫:
-- 前面有些 code
$(deriveDesugarTemplate "xx")
-- 後面有些 code之後就有個叫 xx 的 function 可以用。寫 xx "DeIf" "deIf" 就會生出想要的三百多行程式。
deriveDesugarTemplate :: String -> Q [Dec]
deriveDesugarTemplate funName = do
  Just moduleName <- lookupTypeName "Module"
  moduleInfo <- reify moduleName
  runIO $ putStrLn $ show moduleInfo
  let code = genDataTransformer "deIf" moduleInfo
  runIO $ putStrLn $ code
  fmap pure $ funD (mkName funName) [clause (map (varP . mkName) ["modName", "funPrefix"]) (normalB (varE (mkName "undefined"))) []]一開始的:
Just moduleName <- lookupTypeName "Module"先從看得見的地方(在此例中是 Main.hs 和 DeriveDesugarTemplate import 的 modules),找出叫做 "Module" 的 Name ,再靠:
moduleInfo <- reify moduleName把 Module 的形狀拆出來。
runIO 則可以幫我們在 Q 中印出 moduleInfo ,看看接下來該怎麼寫。
Language.Haskell.Src.Exts.Syntax.Module
Module :
Module SrcLoc ModuleName [ModulePragma] (Maybe WarningText) (Maybe [ExportSpec]) [ImportDecl] [Decl]在 TH 中長這個樣子:
TyConI
  ( DataD
    [] -- Cxt
    Language.Haskell.Exts.Syntax.Module -- Name
    [] -- [TyVarBndr]
    [ NormalC -- [Con]
      Language.Haskell.Exts.Syntax.Module
      [ (NotStrict, ConT Language.Haskell.Exts.SrcLoc.SrcLoc) -- SrcLoc
      , (NotStrict, ConT Language.Haskell.Exts.Syntax.ModuleName) -- ModuleName
      , (NotStrict, AppT ListT (ConT Language.Haskell.Exts.Syntax.ModulePragma)) -- [ModulePragma]
      , (NotStrict, AppT (ConT GHC.Base.Maybe) (ConT Language.Haskell.Exts.Syntax.WarningText)) -- Maybe WarningText
      , (NotStrict, AppT (ConT GHC.Base.Maybe) (AppT ListT (ConT Language.Haskell.Exts.Syntax.ExportSpec))) -- Maybe [ExportSpec]
      , (NotStrict, AppT ListT (ConT Language.Haskell.Exts.Syntax.ImportDecl)) -- [ImportDecl]
      , (NotStrict, AppT ListT (ConT Language.Haskell.Exts.Syntax.Decl)) -- [Decl]
      ]
    ]
    [] -- [Name]
  )我們關心的,主要是 [Con] 中那些 Types , genDataTransformer 會幫忙產生處理 Module 的 function 。
genDataTransformer
genDataTransformer 吃新函數的 prefix 和 Module 在 TH 中的形狀。可以拆成兩部分看,先看後半用 LambdaCase 寫的部分:
\case
  TyConI (DataD [] (Name (OccName tyNameStr) _) [] cons _) ->
    let
      typeSig = funPrefix ++ tyNameStr ++ " :: " ++ tyNameStr ++ " -> " ++ tyNameStr ++ "\n"
    in
      concat $ typeSig : map (conToDef tyNameStr) cons
  others -> error $ "genDataTransformer " ++ show others ++ " not implemented"在 Module 的例子中, typeSig 會變成 "deIfModule :: Module -> Module\n" ,而下面的 map (conToDef tyNameStr) cons 生出來的就是 function body 。
conToDef tyNameStr (NormalC (Name (OccName conNameStr) _) slots) =
  let
    varNames = varNamesForNormalSlots slots
  in
    funPrefix ++ tyNameStr ++
      " (" ++ intercalate " " (conNameStr : varNames) ++ ") = " ++
      intercalate " " (conNameStr : zipWith (\varNameStr (_, ty) -> exprFromType varNameStr funPrefix ty) varNames slots) ++
      "\n"其中:
funPrefix ++ tyNameStr ++
  " (" ++ intercalate " " (conNameStr : varNames) ++ ")"會生出:
deIfModule (Module srcLoc moduleName modulePragma warningText exportSpec importDecl decl)也就是說, varNamesForNormalSlots 會把 slots 收集起來,變成可以 pattern match 的樣子。
intercalate 在這裡做的事情只是在 name 之間插入空白,再接成 String 。
intercalate " " (conNameStr : zipWith (\varNameStr (_, ty) -> exprFromType varNameStr funPrefix ty) varNames slots)會生出:
Module (deIfSrcLoc srcLoc) (deIfModuleName moduleName) (fmap (deIfModulePragma) modulePragma) (fmap (deIfWarningText) warningText) (fmap (fmap (deIfExportSpec)) exportSpec) (fmap (deIfImportDecl) importDecl) (fmap (deIfDecl) decl)exprFromType 生出了遞迴遍歷各個 slot 需要寫的 code 。
varNamesForNormalSlots
finalNames = go M.empty rawNames where
  go nameCount (nameStr : others) =
    if S.member nameStr dupNames then
      let
        nameCount' = M.insertWith (+) nameStr 1 nameCount
        serial = nameCount' M.! nameStr
      in
        (nameStr ++ show serial) : go nameCount' others
    else
      nameStr : go nameCount others
  go _ _ = []這段做的事情是,找出重複的名字,並在名字後面加上數字。這麼一來, Exp Exp Exp 就不會變成 exp exp exp ,而是可以區分彼此的 exp1 exp2 exp3 。先看現在的 nameStr 在不在放著重複名字的 dupNames 中,如果在的話,就用加上數字的名字(nameStr ++ show serial),不然用普通的名字就好(nameStr)。
dupNames :: S.Set String
dupNames = go S.empty S.empty rawNames where
  go res seen (nameStr : others) =
    if S.member nameStr seen then
      go (S.insert nameStr res) seen others
    else
      go res (S.insert nameStr seen) others
  go res _ _ = resdupNames 也很好玩,一開始有兩個空的 Set ,前面的用來放重複的名字,後面的用來放看過的名字。從 rawNames 中把名字一個一個拿出來看,如果這個名字已經看過了(S.member nameStr seen),就塞到 res 中;如果沒看過,就塞到 seen 中。
rawNames 又是怎麼來的?靠 varNameFromType 從 slots 裡面拆出來的:
rawNames = map (\(_, ty) -> case varNameFromType ty of Name (OccName str) _ -> lowerHead str) slots拆出來後把字首小寫(lowerHead)。
varNameFromType
varNameFromType :: Type -> Name
varNameFromType (ConT name) = name
varNameFromType (all @ (AppT f x)) = case maybeVarNameFromType all of
  Just o -> o
  Nothing -> varNameFromType x
varNameFromType others = error $ "varNameFromType " ++ show others ++ " not implemented"拆 Name 出來時,有意思的是那個 maybeVarNameFromType 。會濾掉包裝或基本的 Type :
trivialTypes :: S.Set String
trivialTypes = S.fromList ["String", "Maybe", "Int", "Rational", "Char", "Integer", "Bool"]做法是:
maybeVarNameFromType :: Type -> Maybe Name
maybevarNameFromType (ConT (name @ (Name (OccName nameStr) _)))
  | S.member nameStr trivialTypes = Nothing
  | otherwise = Just name
maybeVarNameFromType (AppT f x) = maybeVarNameFromType f <|> maybeVarNameFromType x
maybeVarNameFromType _ = Nothing這樣就不用特地把 Maybe WarnText 寫成 mWarnText ,只要寫成 warnText 就好。該如何正確地處理它們,則交給 exprFromType 。
exprFromType 和 transExprFromType
exprFromType :: String -> String -> Type -> String
exprFromType name funPrefix ty = "(" ++ transExprFromType funPrefix ty ++ " " ++ name ++ ")"exprFromType 套上了最外面的 () 與目標物(name)。重點在:
transExprFromType :: String -> Type -> String
transExprFromType funPrefix = genExpr where
  genExpr = \case
    ConT (Name (OccName name) _)
      | S.member name trivialTypes -> "id"
      | otherwise -> funPrefix ++ name
    AppT ListT x -> "fmap (" ++ genExpr x ++ ")"
    AppT (ConT (Name (OccName "Maybe") _)) x -> "fmap (" ++ genExpr x ++ ")"
    AppT (AppT (TupleT 2) a) b -> "(" ++ genExpr a ++ ") *** (" ++ genExpr b ++ ")"
    others -> error $ "exprFromType " ++ show others ++ " not implemented"這邊才真的看每個 Type 的形狀,決定該不該加 fmap ,該加幾層。
那個 *** 是 Control.Arrow 裡的東西,又提醒我,得把看到一半的文章看完,這樣才有機會搞懂 FRP 。
簡化過的 DerivieDesugarTemplate.hs 全文
module DeriveTemplate where
import Data.Functor
import Data.List
import Data.Char
import Data.Monoid
import Control.Applicative
import Control.Monad
import qualified Data.Set as S
import qualified Data.Map as M
import Language.Haskell.TH
import Language.Haskell.TH.Syntax
trivialTypes :: S.Set String
trivialTypes = S.fromList ["String", "Maybe", "Int", "Rational", "Char", "Integer", "Bool"]
deriveDesugarTemplate :: String -> Q [Dec]
deriveDesugarTemplate funName = do
  Just moduleName <- lookupTypeName "Module"
  moduleInfo <- reify moduleName
  runIO $ putStrLn $ show moduleInfo
  let code = genDataTransformer "deIf" moduleInfo
  runIO $ putStrLn $ code
  fmap pure $ funD (mkName funName) [clause (map (varP . mkName) ["modName", "funPrefix"]) (normalB (varE (mkName "undefined"))) []]
maybeVarNameFromType :: Type -> Maybe Name
maybevarNameFromType (ConT (name @ (Name (OccName nameStr) _)))
  | S.member nameStr trivialTypes = Nothing
  | otherwise = Just name
maybeVarNameFromType (AppT f x) = maybeVarNameFromType f <|> maybeVarNameFromType x
maybeVarNameFromType _ = Nothing
varNameFromType :: Type -> Name
varNameFromType (ConT name) = name
varNameFromType (all @ (AppT f x)) = case maybeVarNameFromType all of
  Just o -> o
  Nothing -> varNameFromType x
varNameFromType others = error $ "varNameFromType " ++ show others ++ " not implemented"
varNamesForNormalSlots :: [StrictType] -> [String]
varNamesForNormalSlots slots =
  let
    dupNames :: S.Set String
    dupNames = go S.empty S.empty rawNames where
      go res seen (nameStr : others) =
        if S.member nameStr seen then
          go (S.insert nameStr res) seen others
        else
          go res (S.insert nameStr seen) others
      go res _ _ = res
    rawNames = map (\(_, ty) -> case varNameFromType ty of Name (OccName str) _ -> lowerHead str) slots
    finalNames = go M.empty rawNames where
      go nameCount (nameStr : others) =
        if S.member nameStr dupNames then
          let
            nameCount' = M.insertWith (+) nameStr 1 nameCount
            serial = nameCount' M.! nameStr
          in
            (nameStr ++ show serial) : go nameCount' others
        else
          nameStr : go nameCount others
      go _ _ = []
  in
    finalNames
transExprFromType :: String -> Type -> String
transExprFromType funPrefix = genExpr where
  genExpr = \case
    ConT (Name (OccName name) _)
      | S.member name trivialTypes -> "id"
      | otherwise -> funPrefix ++ name
    AppT ListT x -> "fmap (" ++ genExpr x ++ ")"
    AppT (ConT (Name (OccName "Maybe") _)) x -> "fmap (" ++ genExpr x ++ ")"
    AppT (AppT (TupleT 2) a) b -> "(" ++ genExpr a ++ ") *** (" ++ genExpr b ++ ")"
    others -> error $ "exprFromType " ++ show others ++ " not implemented"
exprFromType :: String -> String -> Type -> String
exprFromType name funPrefix ty = "(" ++ transExprFromType funPrefix ty ++ " " ++ name ++ ")"
lowerHead :: String -> String
lowerHead (c:cs) = toLower c : cs
genDataTransformer :: String -> Info -> String
genDataTransformer funPrefix =
  let
    conToDef tyNameStr (NormalC (Name (OccName conNameStr) _) slots) =
      let
        varNames = varNamesForNormalSlots slots
      in
        funPrefix ++ tyNameStr ++
          " (" ++ intercalate " " (conNameStr : varNames) ++ ") = " ++
          intercalate " " (conNameStr : zipWith (\varNameStr (_, ty) -> exprFromType varNameStr funPrefix ty) varNames slots) ++
          "\n"
  in
    \case
      TyConI (DataD [] (Name (OccName tyNameStr) _) [] cons _) ->
        let
          typeSig = funPrefix ++ tyNameStr ++ " :: " ++ tyNameStr ++ " -> " ++ tyNameStr ++ "\n"
        in
          concat $ typeSig : map (conToDef tyNameStr) cons
      others -> error $ "genDataTransformer " ++ show others ++ " not implemented"讀到這裡,認為有機會靠 TH 寫出簡潔的 pretty print ,但還是不確定從何下手才好。
順其自然吧。