1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
|
#+date: <2025-02-25 Mon 19:20:05>
#+title: Navigating a Challenging Proton Mail to Migadu Email Migration
#+description: Detailed insights and step-by-step guide on migrating emails from Proton Mail to Migadu, tackling common challenges and solutions for a successful email migration process.
#+slug: email-migration
#+filetags: :email:migration:protonmail:
* The Setup
I recently migrated my emails from Proton Mail to Migadu after a failed attempt
to get myself into the Proton ecosystem and wanted to detail my process, as it
was far more painful than expected.
To give some context: I had nearly 5000 messages stored, accounting for around
2.5 GB of space.
Overall, this process would have taken all day had I done it in one sitting, but
I decided to break it up and it lasted a couple days before I was able to say that
all my messages were stored in my new account.
* Exporting Messages
To start, I needed to export my messages from Proton Mail. As I am using macOS,
I was able to use the [[https://proton.me/support/proton-mail-export-tool][Proton Mail Export Tool]]. However, the downside is that
this dumps every single email on your account into a single folder in the =.eml=
format. They also export a JSON file for each message, in the case you're
importing back into Proton Mail.
This means that anything in my Inbox, Sent, Archive, Trash, and user-created
folders were all dumped out into a single folder with incomprehensible names.
Without a clear path to easily figure out how to re-organize my emails into a
new account, I was left a bit annoyed at Proton's export process.
* Importing Messages
Left with a pile of messages and no way to discern what they were without
opening each one, I decided to try and use Thunderbird to import messages into
my new Migadu IMAP account.
This led to a dead end as my two methods failed:
1. [[https://addons.thunderbird.net/en-US/thunderbird/addon/importexporttools-ng/][ImportExportTools NG]] does not work with my version of Thunderbird (135).
2. Manually dragging the =.eml= files onto a folder in Thunderbird worked for
small batches of files, but seemed to lock up if I tried to import more than
a few hundred at a time. It also seemed a bit buggy, as I ended up with many
duplicate, and sometimes triplicate, messages.
At this point, I decided to take a step back and use [[https://github.com/djcb/mu][mu]], a command-line utility
that would index my files and sync back and forth with Migadu for me.
Using my blog post, [[https://cleberg.net/blog/mu4e.html][Email in Doom Emacs with Mu4e on macOS]], (and skipping the
mu4e parts) I was able to set up a minimal directory connected to my Migadu IMAP
account. Using my terminal, I simply moved all of my messages into the =mu=
directory and synchronized the account, and voila, my messages synchronized
successfully to the remote server and my other email clients.
However, the remaining issue was that I now had all 5000 messages in the Archive
folder and needed to figure out how to organize them back into their proper
directories.
* Organizing Messages Into Folders
As with any problem, I used Python as my hammer to fix the problem. I started by
creating the directories required in Thunderbird, fetching them with =mbsync= so
that they appeared in my =mu= directory, and using Python to organize my
messages into the newly-created sub-folders.
** Sent Messages
I started by organizing my Sent messages. This required checking each file for
the =From= header and moving them to the Sent folder.
#+begin_src shell
cd ~/.maildir/migadu/Archive/cur
nano _sent.py
#+end_src
#+begin_src python
# _sent.py
import os
import glob
import shutil
# Loop through all files in the current folder
for file in glob.glob("*.eml"):
# Create boolean to check if we should move the file
move = False
# Open the current file
f = open(file, 'r')
# For each line in file, find the From header
for line in f:
if line.startswith("From:"):
# If we find ourself, mark the message for move
if "user@example.com" in line:
move = True
# Close the file
f.close()
# Move the file, if marked for move
if move == True:
filepath = os.path.join("/Users/YOUR_USERNAME/.maildir/migadu/Archive/cur/", file)
new_filepath = os.path.join("/Users/YOUR_USERNAME/.maildir/migadu/Sent/cur/", file)
shutil.move(filepath, new_filepath)
#+end_src
#+begin_src python
python3 _sent.py
#+end_src
The only downside to my current approach is that it was the quick and dirty
option, so I re-ran it while editing the =user@example.com= string for each
email I wanted to move. If I had wanted to create a more well-defined solution,
I would have created an array of addresses to check for and have the =if=
statement check against that array.
Regardless, I was able to run this with the addresses I wanted to move to the
Sent folder and was soon finished.
** Archive Sub-Folders
Next, I needed to move the remaining ~3000 messages from the Archive folder into
dated sub-folders, organized as such:
- Archive/2016
- ...
- Archive/2025
To do this, I followed a similar approach as the method above but check for the
=Date= header instead of the =From= header.
#+begin_src shell
cd ~/.maildir/migadu/Archive/cur
nano _archive.py
#+end_src
This approach requires finding the =X-Pm-Date= header and splitting it by the
spaces contained within. Once split into a list, we must select the fourth
element, as that contains the year which will match the directory we should move
it to.
For example, the header =X-Pm-Date: Fri, 07 Feb 2025 16:12:08 +0000= will be
split into a list as such:
#+begin_src python
[
'X-Pm-Date:', # 0
'Fri,', # 1
'07', # 2
'Feb', # 3
'2025', # 4
'16:12:08', # 5
'+0000' # 6
]
#+end_src
From this list, we select the fourth element (=2025=) and use that to build the
destination path.
#+begin_src python
# _archive.py
import os
import glob
import shutil
# Loop through all files in the sub-folders under Archive
for file in glob.glob("*.eml"):
# Create boolean to check if we should move the file
move = False
# Open the current file
f = open(file, 'r')
# For each line in file, find the X-Pm-Date header
for line in f:
if line.startswith("X-Pm-Date"):
# Split the line into a list by spaces;
# Then select the item that contains the year
year = line.split(" ")[4]
move = True
# Close the file
f.close()
# Move the file, if marked for move
if move == True:
filepath = os.path.join("/Users/YOUR_USERNAME/.maildir/migadu/Archive/cur/", file)
new_filepath = os.path.join(f"/Users/YOUR_USERNAME/.maildir/migadu/Archive/{year}/cur/", file)
shutil.move(filepath, new_filepath)
#+end_src
#+begin_src python
python3 _archive.py
#+end_src
At this point, we've now moved all Sent messages to the Sent box and organized
all messages under the Archive folder into their correct sub-folders.
If you exported other files, such as files from your Inbox, Trash, etc., you
could follow a similar approach and determine the best header or attribute to
identify them for further organization.
** Synchronize the Results
Before synchronizing the files in their new locations, I needed to remove the
characters at the end of the file name since =mu= appends IDs to the end of file
names.
#+begin_src shell
cd ~/.maildir/migadu/Archive
nano _sync_prep.py
#+end_src
This script prepares the =Archive= sub-folders for synchronization, but the same
concept applies to the Sent folder, except you'd replace =*/cur/*= with =*= if
this script were inside the =Sent/cur= directory.
#+begin_src python
import glob
import shutil
# Loop through all files in the sub-folders under Archive
for file in glob.glob("*/cur/*"):
# Remove the characters at the end of the file name created by =mu=
new_file = file.split(",U=",1)[0]
# Move the file to the new file name
shutil.move(file, new_file)
#+end_src
#+begin_src shell
python3 _sync_prep.py
#+end_src
Finally, we can synchronize the results.
#+begin_src shell
mbsync -aV
#+end_src
* Removing Duplicates
My only remaining issue at the time of writing is identifying and removing
duplicate messages. I have toyed with simple Python and command-line solutions
to identify duplicate files, but could not get them to effectively define all
the duplicates found in any specific directory.
I've even tried using the [[https://github.com/pkolaczk/fclones][fclones]] utility, to no avail. It seems that something
in the Proton export, my manual Thunderbird method attempt, or possible sync
issues between Thunderbird -> Migadu <-> mu caused duplicates where content
within the message has been modified.
Although I now seem to be wasting space and in need of a deduplication tool, I
have all of my messages migrated to my new service.
|