Coding in Python


Reading mail with Python and building HTML

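Last time I wrote a Python program that generates an HTML page, but its inputs, the attached images and the body text, still had to be downloaded from the mail to the PC by hand.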

That only takes a few minutes, but having come this far I wondered whether this step could be turned into a program as well.

Being able to reach Outlook mail from Python would also open up many other uses.

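A search does turn up plenty of examples and guides for processing Outlook mail with Python, but building something by copy and paste did not quite work. Each environment seems to differ slightly, and some experienced authors leave out the elementary steps.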

It took me a full day to get it working, so I am posting the result here.

Afterwards I merged the mail reading and the HTML generation into a single Python script.

Install the win32com module (the Python extensions for Windows, distributed as the pywin32 package). It can drive Outlook and Excel.
Reference: Microsoft Docs.
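
Before running the script it may help to check that the third-party modules it imports are actually installed. The following is a minimal sketch and not part of the original program: the helper name `missing_packages` and the `REQUIRED` table are made up for illustration. It maps each import name used by the script to the pip package that normally provides it.

```python
import importlib.util

# Import name -> pip package name.
# win32com is provided by pywin32, cv2 by opencv-python.
REQUIRED = {
    "win32com": "pywin32",
    "natsort": "natsort",
    "cv2": "opencv-python",
}


def missing_packages(required=REQUIRED):
    """Return the pip package names whose import name cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Print the command to run; nothing is installed automatically.
        print("pip install " + " ".join(missing))
    else:
        print("all required modules are installed")
```

The check only looks for the top-level module, so it does not prove that Outlook itself is installed or that COM automation will succeed.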

Program specification and notes

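Read the contents of the target Outlook folder for today's date.
Download the attached images into a folder.
Attachments are written with SaveAsFile, which appears to require an absolute path (for example C:\zzz).
Keep the mail body in a variable.
In the HTML, everything except the title, the image names, the captions, and the body text (bold) is fixed text, so it is copied from separate files.
The title tag in the head and the heading in the body are dummies in the template and are replaced with the real ones.
Wrap each paragraph of the body in HTML paragraph tags <p> . . . . </p>.
Image names are read as strings, not numbers, so 10 sorts before 2. This took some effort to handle, but natsorted() solved it.
Change the display size depending on whether the image is portrait or landscape.
Occasionally Python raises an error when converting text to UTF-8. There is a workaround, but it was too much trouble, so I handle those cases by hand.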
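
Three of the items above (natural ordering of image names, choosing a size by orientation, and wrapping paragraphs in <p> tags) are pure text or number handling and can be sketched without Outlook. This is an illustration under assumptions, not the script itself: the function names `natural_key`, `photo_div_id`, and `wrap_paragraphs` are hypothetical, and `natural_key` is a standard-library stand-in for what natsorted() does in the script. The div ids and the 1.5 ratio are the ones the script uses.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so that 2 sorts before 10."""
    return [
        int(chunk) if chunk.isdigit() else chunk.lower()
        for chunk in re.split(r"(\d+)", name)
    ]


def photo_div_id(height, width):
    """Pick the div id for an image of the given pixel size."""
    if height > width:           # portrait
        return "photo1"
    if width / height > 1.5:     # clearly wider than tall
        return "photo3"
    return "photo2"              # default


def wrap_paragraphs(lines):
    """Join lines into paragraphs at blank lines and wrap each in <p>...</p>."""
    paragraphs, current = [], []
    for line in lines:
        if line.strip():
            current.append(line.strip())
        elif current:            # a blank line closes the current paragraph
            paragraphs.append("".join(current))
            current = []
    if current:
        paragraphs.append("".join(current))
    return "\n".join("<p>" + p + "</p>" for p in paragraphs)


names = ["2020 7 3 10.jpg", "2020 7 3 2.jpg", "2020 7 3 1.jpg"]
print(sorted(names))                   # plain string sort puts 10 before 2
print(sorted(names, key=natural_key))  # natural order: 1, 2, 10
print(photo_div_id(800, 600))
print(wrap_paragraphs(["first line", "", "second paragraph"]))
```

Lines inside a paragraph are joined without a separator, which suits Japanese text; for English text a space would be the more natural joiner.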

Python code


# -----------------------------------------------------------------------------------------------------------------
# html generator for nagano art
# 1). read outlook mail 
#     input :
#         長野美術館 folder in outlook   
#     output :
#         text_in.txt : text file - first line must be the title, e.g. 美術館訪問記 - 442 ルーヴル・ランス美術館 (ref only)
#         img_in folder : where the attached image files are saved
#
# 2). build html 
#     input :
#           img_in      : image file folder in the same directory
#           text/text_1.txt : html code top 
#           text/text_2.txt : html code bottom
#      output :
#           'nagano_art_' + title_n + '.html'
# -----------------------------------------------------------------------------------------------------------------
# import module
import os      # os interface
import win32com.client   # windows extension (pywin32)
import datetime  # date time 
import shutil  # file operations
import re       # get number 
import cv2     # get img size
from natsort import natsorted  # natural sort 
# ------------------------------------------------------------------------------
# 1).  read outlook mail
# ------------------------------------------------------------------------------
# ----- get today ymd
dt_now = datetime.datetime.now()
dt = dt_now.strftime('%Y-%m-%d') # 2020-07-10 
# dt = '2020-07-10'        
print('date: ', dt)
# os.system('PAUSE')
# outlook file 
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
root_folder = outlook.Folders.Item(3)  # root folder of the target account (named after its mail address); the index depends on the local account order
print('root folder:' , root_folder)
# target folder name
target_folder = root_folder.Folders['長野美術館']
# --- folder for saving the attached files (SaveAsFile needs an absolute path)
img_in = 'img_in' # image file
img_in_path = os.path.join(os.getcwd(), img_in)   # absolute path of the image folder
if (os.path.exists(img_in_path)):   # delete if exists
     shutil.rmtree(img_in_path)
os.mkdir(img_in_path )                # make directory (folder)
# --- mail text path
text_out = 'text_in.txt'
# --- get mail item 
mails =target_folder.Items     # get mail item
print(type(mails))  # <class 'win32com.client.CDispatch'>
for mail in mails:
     print("-----------------")
     print("件名: " ,mail.subject)           # subject
     print('差出人: ' ,mail.sendername)      # sender
     print("受信日時: ", mail.receivedtime)   # received time
     mail_date = str(mail.receivedtime)
#    print("本文: ", mail.body)              # body
# ----------------------------------------------------------------
#    get mail received today  
#    save attached file
#    write text into file (ref only)
# -----------------------------------------------------------------
     if dt in mail_date:                # today's  mail ?
          with open(text_out , 'w' , encoding = 'utf-8') as f_w:
               wlines = mail.subject + '\n' 
               f_w.write(wlines)        # 1st line: subject 
          with open(text_out , 'a' , encoding = 'utf-8' , newline = '' ) as f_w:
               wlines = mail.body
               f_w.write(wlines)
          # ---- get attached files and save 
          attachments = mail.Attachments      # attached files 
          if attachments.Count != 0:          # has attachments
               for attachment in attachments:
                   img_save = os.path.join(img_in_path, str(attachment))
                   attachment.SaveAsFile(img_save)
          break                               # stop here so that `mail` still refers to today's message in step 2
# ------------------------------------------------------------------------------
# 2).build html 
# ------------------------------------------------------------------------------
# --- input img file
img_in_folder = 'img_in'
# --- title name default
title_in = '美術館訪問記 - xxx　yyyy'
# --- predefined text top and bottom
textp_in_1 = 'text/text_1.txt'
textp_in_2 = 'text/text_2.txt'
# --- output html folder    
path_html = 'html'
# --- name of out html 
name_outh_html = 'nagano_art_'
# --- html tag for src img 
tag_1 = '<div id="photo2">'
tag_1_2 = '<div id="photo1">'      # narrower box for portrait images
tag_1_3 = '<div id="photo3">'      # more landscape 
tag_2 = '<a href="javascript:;" onclick="OPEN1(\''
tag_3 = '\')">'
tag_4 = '<img src="'
tag_5 = '"></img></a><p>'
tag_6 = '"></img><p>'
tag_7 = '</p></div>'
#--- img file out folder in html 
o_file = 'img12' 
# ------------------------------------------------------------------------
# get title from mail subject 
# ------------------------------------------------------------------------
title_in = mail.subject.replace('\u3000',' ').replace('-','- ')    # full-width space -> half-width, e.g. 美術館訪問記 - 443 リール美術館
title_n = re.sub(r"\D", "", title_in )    # keep only the digits: the article number, e.g. 443
path_html = path_html + '/' + name_outh_html + title_n + '.html'   # folder , html name
# ------------------------------------------------------------------------
#   copy top html line 
# ------------------------------------------------------------------------
with open(textp_in_1, 'r', encoding = 'utf-8') as f_text , open(path_html, 'w', encoding = 'utf-8') as f_w:
     for rline in f_text:
          if '<title>' in rline:
               rline = '<title>' + title_in + '</title>' + '\n'
          if '<h2>' in rline:
               rline = '<h2><span id="spand">' + title_in + '</span></h2>' + '\n'
          wline =rline
          f_w.write(wline)
          print(wline)
# ------------------------------------------------------------------------
#   read img file name from img folder , sort by name 
# ------------------------------------------------------------------------
img_files_ins = os.listdir(img_in_path)
img_files = natsorted(img_files_ins)
print(img_files)                 # img_files = [2020 7 3 1.jpg , 2020 7 3 2.jpg , ........]   
# ------------------------------------------------------------------------
#   read the text from mail.body (the message body)
# ------------------------------------------------------------------------
line_ins = mail.body.splitlines()  # list 
line_ins = [i.strip() for i in line_ins]   # remove new line code
# ------------------------------------------------------------------------
#   get the caption lines 添付n： (attachment n), e.g. 添付1：美術館.... , 添付2：
# ------------------------------------------------------------------------
line_tempus = [p for p in line_ins if '添付' in p and '：' in p ]
# ------------------------------------------------------------------------
#  add a line break <br> after the character '作' , replace full-width spaces with half-width
# ------------------------------------------------------------------------
line_tempus = [p.replace('作','作<br>') . replace('　',' ') for p in line_tempus]
# ------------------------------------------------------------------------
#  write img tag with its 添付n： caption into the html
# <div id="photo1"><a href="javascript:;" onclick="OPEN1('img12/2020 7 3 1.jpg')">
#    <img src="img12/2020 7 3 1.jpg"></img></a><p>添付1：美術館までの無料バス</p></div>
#    check portrait or landscape
# ------------------------------------------------------------------------
with open(path_html , 'a' , encoding = 'utf-8') as f_w:
     i = 1
     for i_name in img_files:
          pf = img_in_folder + '/' + i_name   # path of the saved image, e.g. img_in/2020 7 3 1.jpg
          print(pf)
          im = cv2.imread(pf)            # returns None when the file cannot be read
          if im is None:
               raise FileNotFoundError('cannot read image: ' + pf)
          im_pf = im.shape               # get height , width , color 
          tag_1_0 = tag_1                # default 
          if im_pf[0] > im_pf[1] :       # portrait ?
               tag_1_0 = tag_1_2         # small width
          if im_pf[1] / im_pf[0] > 1.5 :
               tag_1_0 = tag_1_3         # more landscape  
          wline = '\n' + tag_1_0 + tag_2 + o_file + '/' + i_name + tag_3 + tag_4 + o_file + '/' + i_name + tag_5 + line_tempus[i-1] + tag_7
          f_w.write(wline)
          print(wline)
          i +=  1
# ------------------------------------------------------------------------
# ---- read text and write html , wrapping each paragraph in <p>...</p>
# ------------------------------------------------------------------------
with open(path_html , 'a' , encoding = 'utf-8') as f_w:
     in_text = '\n' + '</div>' + '\n' + '<div id="divtext">' + '\n' + '<p>'   # text start
     f_w.write(in_text)
     i = 1
     for in_text in line_ins:
          if '添付' in in_text and '：' in in_text:     # end 
               break
          if i == 1 and '美術館訪問記' in in_text :   # discard first line 
               continue
          if in_text == '' : 
               in_text = '\n' + '</p>' + '\n' + in_text + '<p>'
          f_w.write(in_text)
          i += 1
     in_text = '</p>' + '\n' + '</div>' + '\n'      # text last 
     f_w.write(in_text)
# ------------------------------------------------------------------------
# ---- copy html part2 , change next html name 
# ------------------------------------------------------------------------
with open(textp_in_2, 'r' , encoding = 'utf-8') as f_text , open(path_html , 'a' , encoding = 'utf-8') as f_w:
     t = int(title_n)  + 1
     title_n = str(t)
     for wline in f_text:
          if '<div id="divnextpage">' in wline:
              wline = '<div id="divnextpage"><a href="nagano_art_' + title_n + '.html"><p>美術館訪問記　No.' + title_n +  'はこちら</p></a></div>' + '\n'
          f_w.write(wline)
print('-------end --------')
#--- end
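
The script picks the mail by testing whether today's date string occurs in the received time, and it derives the article number by deleting every non-digit from the subject. The sketch below shows a slightly more defensive variant of both steps. It is an illustration, not the author's code: `find_mail_for_date` and `article_number` are hypothetical names, and the `SimpleNamespace` objects are stand-ins for Outlook mail items, with a second received date made up for the example. Taking only the first run of digits avoids getting 441441 from a subject that repeats the number, as one of the subjects in the folder does.

```python
import datetime
import re
from types import SimpleNamespace


def find_mail_for_date(mails, day):
    """Return the last mail received on `day` (a datetime.date), or None."""
    wanted = day.isoformat()                  # e.g. '2020-07-03'
    found = None
    for mail in mails:
        if str(mail.receivedtime).startswith(wanted):
            found = mail
    return found


def article_number(subject):
    """Return the first run of digits in the subject, or None if there is none."""
    match = re.search(r"\d+", subject)
    return match.group() if match else None


# Stand-ins for Outlook mail items (hypothetical test data).
mails = [
    SimpleNamespace(subject="美術館訪問記 - 442 ルーヴル・ランス美術館",
                    receivedtime="2020-07-03 08:05:03+00:00"),
    SimpleNamespace(subject="美術館訪問記 - 443 リール美術館",
                    receivedtime="2020-07-10 08:05:03+00:00"),
]

hit = find_mail_for_date(mails, datetime.date(2020, 7, 3))
if hit is None:
    print("no mail for that date")
else:
    print(hit.subject, "->", article_number(hit.subject))
```

Returning None when nothing matches lets the caller stop early instead of building a page from an unrelated mail.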

Notes on working with Outlook mail

It took me a while to reach the target folder, so here are my notes.


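Outlook folder structure

  abc@def.ne.jp      root folder
      Deleted Items (削除済みアイテム)
      Inbox (受信トレイ)
      Outbox (送信トレイ)
      Sent Items (送信済みアイテム)
      Calendar (予定表)
      Contacts (連絡先)
      ........
      長野美術館   (target folder)
      ........
  ghi@jkl.ne.jp      root folder
      Deleted Items (削除済みアイテム)
      Inbox (受信トレイ)
      ........
  mno@pqr.ne.jp      root folder
      Deleted Items (削除済みアイテム)
      Inbox (受信トレイ)
      ........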

Python code (reference: MailItem properties (Outlook))


# outlook 
import win32com.client
import os
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
root_folder = outlook.Folders.Item(3)  # the order of the three accounts is unknown, so I tried the indexes starting from 1
print('root folder:',root_folder)
print(len(root_folder.Folders))
folder_name = '長野美術館'
for i in root_folder.Folders:
    print(i)
    if folder_name in str(i):
        target_folder = root_folder.Folders[folder_name]
        mails =target_folder.Items
        print(type(mails))  # <class 'win32com.client.CDispatch'>
        for mail in mails:
             print('-----------------')
             print('件名: ' ,mail.subject)
             print('送り主: ',mail.sendername)
             print('受信日時: ', mail.receivedtime)
             print('本文: ', mail.body)
        break

Output of the code above (the remarks in parentheses are comments added afterwards)


root folder: abc@def.ne.jp - outlook
20  (number of folders)
削除済みアイテム
受信トレイ
送信トレイ
送信済みアイテム
予定表
連絡先
履歴
メモ
タスク
下書き
RSS フィード
スレッド アクション設定
クイック操作設定
迷惑メール
長野美術館  (target folder)
<class 'win32com.client.CDispatch'>
-----------------
件名:  美術館訪問記 - 441 美術館訪問記 -441　長崎県美術館
送り主:  長野
受信日時:  2020-06-26 08:08:29+00:00
本文:  	(omitted)
-----------------
件名:  美術館訪問記 - 442 ルーヴル・ランス美術館
送り主:  長野
受信日時:  2020-07-03 08:05:03+00:00
本文:  	(omitted)
-----------------
.................................
>>>
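
Since the order of the accounts is not known in advance, one alternative to trying index numbers is to look the root folder up by name, in the same way the code above looks for the target folder. The sketch below is an assumption-laden illustration: `find_folder` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only mimic the `Name` and `Folders` attributes of Outlook folders. With a real Outlook session one would presumably pass `outlook.Folders` instead of the `accounts` list.

```python
from types import SimpleNamespace


def find_folder(folders, name):
    """Return the first folder whose Name contains `name`, or None."""
    for folder in folders:
        if name in str(folder.Name):
            return folder
    return None


# Stand-in objects mimicking part of the folder tree (hypothetical data).
accounts = [
    SimpleNamespace(Name="ghi@jkl.ne.jp",
                    Folders=[SimpleNamespace(Name="受信トレイ")]),
    SimpleNamespace(Name="abc@def.ne.jp - outlook",
                    Folders=[SimpleNamespace(Name="受信トレイ"),
                             SimpleNamespace(Name="長野美術館")]),
]

root = find_folder(accounts, "abc@def.ne.jp")
target = find_folder(root.Folders, "長野美術館") if root else None
print(root.Name if root else None, "->", target.Name if target else None)
```

A substring match is used because the root folder is printed above as the mail address followed by extra text; an exact comparison would be stricter but would need the full display name.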
