Added Java Streaming I/O for FASTA Files#1080
Conversation
…of the URL path to reflect a change in the v2 naming.
josemduarte
left a comment
There was a problem hiding this comment.
Very nice feature, thanks! A couple of comments below. Apologies for the slow response.
| InputStream input; | ||
| try { | ||
| input = provider.getInputStream(getPath().toFile()); | ||
| } catch (IOException exception) { |
There was a problem hiding this comment.
Wouldn't it be better to throw the exception so that it can be handled at a higher level?
There was a problem hiding this comment.
Or if stream() doesn't permit a throws in the signature, then I'd advice to use an UncheckedIOException instead of RuntimeException
There was a problem hiding this comment.
The exceptions would break stream chaining, as the contrived example below shows. I changed to UncheckedIOException. I originally had the exception being thrown, so it is allowed by Java, but it caused me problems when I used it in my code, so I get where you are coming from.
String file = this.getClass().getResource("PF00104_small.fasta.gz").getFile();
String file2 = this.getClass().getResource("PF00105_small.fasta.gz").getFile();
List<Path> paths = List.of(Paths.get(file), Paths.get(file2));
List<ProteinSequence> sequences = paths
.stream()
.flatMap(path -> FastaStreamer.from(path).stream()) // <-- Not allowed if exception
.collect(Collectors.toList());| * for use in a functional programming paradigm. | ||
| * | ||
| * @author Gary Murphy | ||
| * @since 7.0.3 |
There was a problem hiding this comment.
I think we should bump up to 7.1.0 since this introduces a (very nice) new feature
There was a problem hiding this comment.
I changed to 7.1.0-SNAPSHOT
| closed = true; | ||
| reader.close(); | ||
| } catch (IOException exception) { | ||
| throw new RuntimeException(String.format("I/O error reading the FASTA file from '%s'", getPath())); |
There was a problem hiding this comment.
Changed to UncheckedIOException
|
|
||
| // Download locations | ||
| public static final String SCOP_DOWNLOAD = "http://scop.berkeley.edu/downloads/parse/"; | ||
| public static final String SCOP_DOWNLOAD_ALTERNATE = "http://scop.berkeley.edu/downloads/parse/"; |
There was a problem hiding this comment.
It looks like these changes were from #1077 . I've merged that one now. Could you merge master in to have a cleaner diff?
There was a problem hiding this comment.
Dang. I thought I was. This is my first time dealing with a forked repo. I guess I will use the webapp instead of CLI. (Sigh)
| * @param sequence the protein sequence | ||
| * @return the sequence | ||
| */ | ||
| protected ProteinSequence createSequence(String header, ProteinSequence sequence) { |
There was a problem hiding this comment.
The header parameter is not used, is that intentional?
There was a problem hiding this comment.
Good catch. The was in there before I found getOriginalHeader.
|
I am not sure how to fix the diff issue. I did the |
|
Simply merge the latest master into your branch: you do that locally in your computer, then push to your fork. Then github will automagically pick it up and you will see an updated PR. |
josemduarte
left a comment
There was a problem hiding this comment.
Thanks for the changes. One final request from me: could you add a new 7.1.0 - future release section in the changelog and describe this new feature briefly?
|
This looks pretty cool -- may I ask is it primarily about new syntax/language features, or are there some performance considerations? If the caller is simply going to |
josemduarte
left a comment
There was a problem hiding this comment.
Thanks again! I'll try to create a new release in the next days
|
@heuermh It is more to do with liking in the streams programming style. I do all of my alignment testing, homologies, etc. using streams-based processing. I haven't done any benchmarking, but I would expect it to be pretty close to imperative loops. |
Added a class to use Java streams to process FASTA files.