Skip to content

Conversation

@antgonza
Copy link
Member

No description provided.

Copy link

@sjanssen2 sjanssen2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally like the idea and the cleanup of the existing test. I would love to see instructions on what the user can do with the information provided in the MANIFEST, i.e. how to obtain these files.

Comment on lines +9 to +12
- `summary.html`: a browser friendly file listing that will include all files at `[artifact-id]/[output-folder]` and
any `index.html` files in any subfolder. As a reminder, the Qiita nginx basic configuration allows to display/load any
html/JS available files; thus, able to display properly `index.html` files available
- `MANIFEST.txt`: a comprehensive list of all available files in the folder.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the wording, it is not 100% clear to me if

  • "will include all files at [artifact-id]/[output-folder]" includes recursion through all sub-directories
  • "Qiita nginx basic configuration allows to display/load any html/JS available files" which lines in the example configuration refers to this? Can you point to those?
  • which class of user can make use of the MANIFEST.txt. Can he/she download the whole directory as a ZIP archive?

The two main plugins using this output are:

- https://github.com/qiita-spots/qp-knight-lab-processing: which will generate an `[output-folder]` contaning all the logs,
files and summaries from BCL to clean FASTQ processing. Note that multiqc resoults are part of this and the outputs are

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add which user class can run this command? I figure it is only admins but no normal users, right?

Comment on lines 9 to 15
from unittest import main
from tempfile import mkdtemp
from os import remove
from os.path import exists, isdir, join, dirname, abspath
from inspect import currentframe, getfile
from shutil import rmtree, copytree
from json import dumps
from os import remove
from os.path import abspath, dirname, exists, isdir, join
from shutil import copytree, rmtree
from tempfile import mkdtemp
from unittest import main

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, your editor seems to have flipped ordering of imports. Makes it harder to see actual differences :-/

Comment on lines 73 to 75
self._clean_up_files.extend([ff["filepath"]])
for f in res["files"].values()
for ff in f

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my intuition says that shorter variable names are inner elements. Here f is a list and ff a file-dict.
With three lines, we have more space, therefore I suggest to change f into files and ff into file

Comment on lines 81 to 87
print("-------------")
print("-------------")
print(html)
print("-------------")
print(EXP_HTML.format(aid=aid))
print("-------------")
print("-------------")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover from debugging?

separator = "|--"
for dpath, _, files in walk(folder):
# assuring same order, mainly for testing
files.sort()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this inplace sorting?

index.append(("file", f"{dpath}/{f}"))
# if we are not at the top, we should only add
# the index.html files
elif "index.html" in files:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have to take care for different case spelling, e.g. a file with name Index.html?

elif "index.html" in files:
index.append(("file", f"{dpath}/index.html"))

depth = dpath.replace(folder, "").count(sep)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is folder relative or absolute? What if the path contains the folder infix twice like job-output/4/job-output/4/subdir?

with open(manifest_fp, "w") as of:
of.write("\n".join(manifest))

links = [link % (manifest_fp[tlink:], "file", manifest_fp[tname:])]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would https://www.geeksforgeeks.org/python/python-os-path-relpath-method/ be a safer solution? Instead of performing string operations here?

Comment on lines 58 to 62
for ft, f in index:
# to avoid any duplication of lines:
_link = link % (f[tlink:], ft, f[tname:])
if _link not in links:
links.append(_link)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't a sorted(list(set())) do?

Copy link

@sjanssen2 sjanssen2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When looking closer at the currently failing test, I see your debug info:

-------------
<a href="./10/test_data/MANIFEST.txt" type="file" target="_blank">test_data/MANIFEST.txt</a><br/>
<a href="./10/test_data/file_1" type="file" target="_blank">test_data/file_1</a><br/>
<a href="./10/test_data/file_2" type="file" target="_blank">test_data/file_2</a><br/>
<a href="./10/test_data/folder_a/folder_b/index.html" type="file" target="_blank">test_data/folder_a/folder_b/index.html</a><br/>
<a href="./10/test_data/folder_1/index.html" type="file" target="_blank">test_data/folder_1/index.html</a><br/>
<a href="./10/test_data/test_data/folder_a/folder_b/index.html" type="file" target="_blank">test_data/test_data/folder_a/folder_b/index.html</a><br/>
<a href="./10/test_data/test_data/folder_1/index.html" type="file" target="_blank">test_data/test_data/folder_1/index.html</a>
-------------
<a href="./10/test_data/MANIFEST.txt" type="file" target="_blank">test_data/MANIFEST.txt</a><br/>
<a href="./10/test_data/file_1" type="file" target="_blank">test_data/file_1</a><br/>
<a href="./10/test_data/file_2" type="file" target="_blank">test_data/file_2</a><br/>
<a href="./10/test_data/folder_a/folder_b/index.html" type="file" target="_blank">test_data/folder_a/folder_b/index.html</a><br/>
<a href="./10/test_data/folder_1/index.html" type="file" target="_blank">test_data/folder_1/index.html</a>
-------------

and assume the above is the observation, below is expectation.

Why do you observe ./10/test_data/test_data/folder_1/index.html? The test_data part should not be repeated?! As the provided test_data does NOT contain itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants