caasih.net

Re: Build Your Own Haskell Compiler #0.2

parseModule

其實他很邪惡,要用到時才開始讀,雖然他叫 lazy ,但是跟 lazy evaluation 沒有關係。 m***m 很不喜歡。

先開個新檔案 sample/A.hs

module Main where

import Prelude ()

main = putStrLn "hello, world"

讀檔可以用 Prelude 中的 getContents ,它會還給你一個 IO String ,可以用在 main 的 do notation 裡面:

main = do
  inputStr <- getContents
  putStrLn inputStr

這樣子輸入一行就能看到程式輸出一行。

inputStr 餵給來自 Language.Haskell.Exts.Annotated 的 parseModule ,可以得到 ParseResult ,再以 case ... of 看看結果是 ParseOk 還是 ParseFailed ,從裡面拆出 AST :

main = do
  inputStr <- getContents
  let res = parseModule inputStr
  putStrLn $ case res of
    ParseOk mod -> show mod
    ParseFailed _ msg -> msg

Language.Haskell.Exts.Annotated 吐出來的 AST 包含了描述程式碼位置的 SrcSpanInfo ,直接印出來大概像:

Module (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 1 1 6 1, srcInfoPoints = [SrcSpan "<unknown>.hs" 1 1 1 1,SrcSpan "<unknown>.hs" 1 1 1 1,SrcSpan "<unknown>.hs" 3 1 3 1,SrcSpan "<unknown>.hs" 5 1 5 1,SrcSpan "<unknown>.hs" 6 1 6 1,SrcSpan "<unknown>.hs" 6 1 6 1]}) (Just (ModuleHead (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 1 1 1 18, srcInfoPoints = [SrcSpan "<unknown>.hs" 1 1 1 7,SrcSpan "<unknown>.hs" 1 13 1 18]}) (ModuleName (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 1 8 1 12, srcInfoPoints = []}) "Main") Nothing Nothing)) [] [ImportDecl {importAnn = SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 3 1 3 18, srcInfoPoints = [SrcSpan "<unknown>.hs" 3 1 3 7]}, importModule = ModuleName (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 3 8 3 15, srcInfoPoints = []}) "Prelude", importQualified = False, importSrc = False, importSafe = False, importPkg = Nothing, importAs = Nothing, importSpecs = Just (ImportSpecList (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 3 16 3 18, srcInfoPoints = [SrcSpan "<unknown>.hs" 3 16 3 17,SrcSpan "<unknown>.hs" 3 17 3 18]}) False [])}] [PatBind (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 5 1 5 31, srcInfoPoints = []}) (PVar (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 5 1 5 5, srcInfoPoints = []}) (Ident (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 5 1 5 5, srcInfoPoints = []}) "main")) (UnGuardedRhs (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 5 6 5 31, srcInfoPoints = [SrcSpan "<unknown>.hs" 5 6 5 7]}) (App (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 5 8 5 31, srcInfoPoints = []}) (Var (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 5 8 5 16, srcInfoPoints = []}) (UnQual (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 5 8 5 16, srcInfoPoints = []}) (Ident (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 5 8 5 16, srcInfoPoints = []}) "putStrLn"))) (Lit (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 5 17 5 31, srcInfoPoints = []}) (String (SrcSpanInfo {srcInfoSpan = SrcSpan "<unknown>.hs" 5 17 5 31, srcInfoPoints = []}) "hello, world" "hello, world")))) Nothing]

如果覺得自己在操作 AST 時,會被 SrcSpanInfo 干擾,可以加上 fmap (const ())

main = do
  inputStr <- getContents
  let res = parseModule inputStr
  putStrLn $ case res of
    ParseOk mod -> show (fmap (const ()) mod)
    ParseFailed _ msg -> msg

fmapconst () 作用在整個 AST 上(前面提到的 annamap 則作用在單一個 node 上),把全部的 SrcSpanInfo 清光光,變成:

Module () (Just (ModuleHead () (ModuleName () "Main") Nothing Nothing)) [] [ImportDecl {importAnn = (), importModule = ModuleName () "Prelude", importQualified = False, importSrc = False, importSafe = False, importPkg = Nothing, importAs = Nothing, importSpecs = Just (ImportSpecList () False [])}] [PatBind () (PVar () (Ident () "main")) (UnGuardedRhs () (App () (Var () (UnQual () (Ident () "putStrLn"))) (Lit () (String () "hello, world" "hello, world")))) Nothing]

在還沒有真正的 pretty print 前,手動排版後,看起來是:

Module
  ()
  (Just (ModuleHead () (ModuleName () "Main") Nothing Nothing))
  []
  [ ImportDecl
      { importAnn = ()
      , importModule = ModuleName () "Prelude"
      , importQualified = False
      , importSrc = False
      , importSafe = False
      , importPkg = Nothing
      , importAs = Nothing
      , importSpecs = Just (ImportSpecList () False [])
      }
  ]
  [ PatBind
      ()
      (PVar () (Ident () "main"))
      (UnGuardedRhs
        ()
        (App
          ()
          (Var () (UnQual () (Ident () "putStrLn")))
          (Lit () (String () "hello, world" "hello, world"))))
      Nothing
  ]

對我來說,在習慣查文件前,可以對整個 AST 長什麼樣子,有個概念。

Isaac Huang 發佈於