External storage, enhancements, bugfixes#377
Conversation
|
Generally speaking, seems reasonable to me - thoughts:
|
|
|
||
| # serialize object | ||
| blob = pack(obj) | ||
| hash = long_hash(blob) + store[len('external-'):] |
There was a problem hiding this comment.
store[len('external-'):] seems incongruent w/r/t line 88: store = 'external' + ('-' if store else '') + store which could conceivably allow only external; that said using only 'external' as a specifier is already blocked in declare.py so I'm not sure how important this is.. perhaps line 88 should throw an error rather than allowing 'external' only?
There was a problem hiding this comment.
The idea is to allow either external or external-somethin where somethin cannot exceed 8 characters. external by itself should not be blocked. If it is, it's my mistake and I will fix it and add a test for it. Many people will only have one external storage and they will just use external.
There was a problem hiding this comment.
Why is store[len('external-'):] incongruent? You can do a[n:] where n exceeds the size of a no problem.
There was a problem hiding this comment.
Ah ok - I think on first read I was taking this to mean there was an explicit expectation that the slice would return something to append to the hash, which was then "confirmed" when I saw that declare.py requires the external- convention (via the 'isidentifier' check on the secondary portion) and throws exceptions suggesting this is a requirement.
Not allowing the bare 'external' in some cases (here, declare.py) but allowing it in another (line 88) is why I proposed one was incongruent with respect to the other.
There was a problem hiding this comment.
if 'external' alone is is allowed, it looks like the store_name parts from declare (line 191+) need to be altered -
>>> storestr='external'
>>> storestr.split('-')
['external']
>>> store_name = storestr.split('-')
>>> store_name[1:]
[]
>>> ''.join(store_name[1:])
''
>>> ''.join(store_name[1:]).isidentifier()
False
There was a problem hiding this comment.
I see. It should be isidentifier or empty.
|
Concerning preparation for #204 - (other commits in that branch relate to other parts of S3 integration) currently defined tests are passing locally. If you agree with this approach, it might make sense to get that mechanism sorted out and merged in to your branch with this PR; alternately can tack it on afterwords. Either way good to get the discussion started. |
datajoint/relational_operand.py
Outdated
| return arg, _negate | ||
| elif isinstance(arg, AndList): | ||
| return '(' + ' AND '.join([make_condition(element)[0] for element in arg]) + ')', _negate | ||
| elif arg is True or arg is False: |
There was a problem hiding this comment.
Much prefer isinstance(arg, bool)
datajoint/declare.py
Outdated
| raise DataJointError('External attributes cannot be primary in:\n%s' % line) | ||
| store_name = match['type'].split('-') | ||
| if store_name[0] != 'external': | ||
| raise DataJointError('External store types must be in format external-<name>') |
There was a problem hiding this comment.
Should say in format external vs external-<name>
datajoint/declare.py
Outdated
| if store_name[0] != 'external': | ||
| raise DataJointError('External store types must be in format external-<name>') | ||
| store_name = '-'.join(store_name[1:]) | ||
| if not store_name.isidentifier(): |
There was a problem hiding this comment.
This will reject external - needs to be updated to not (store_name == ' ' or storename.isidentifier())
datajoint/relational_operand.py
Outdated
| "Cannot restrict in place a projection with renamed attributes." | ||
| if isinstance(restriction, AndList): | ||
| self.restrictions.extend(restriction) | ||
| :param as_or: if True than the applied restriction becomes or-listed with already existing conditions |
There was a problem hiding this comment.
This is documented but not present - either implement it or update the doc.
datajoint/relational_operand.py
Outdated
| return self & Not(restriction) | ||
|
|
||
| def restrict(self, restriction): | ||
| def restrict(self, arg): |
There was a problem hiding this comment.
I liked it better to call it restriction rather than arbitrary arg as the parameter name.
| from . schema_external import schema | ||
|
|
||
|
|
||
| def test_external_put(): |
There was a problem hiding this comment.
need to add test case for pure external in addition to external-xxx formatted configs
tests/test_foreign_keys.py
Outdated
| @@ -1,5 +1,6 @@ | |||
| from nose.tools import assert_equal, assert_false, assert_true | |||
| from datajoint.declare import declare | |||
| from datajoint import AndList, OrList | |||
There was a problem hiding this comment.
why importing AndList and OrList here? Appears to be unused.
| logger.info('Found %d keys to populate' % len(keys)) | ||
| for key in keys: | ||
|
|
||
| make = self._make_tuples if hasattr(self, '_make_tuples') else self.make |
There was a problem hiding this comment.
We should throw deprecation warning when we find _make_tuples
There was a problem hiding this comment.
Concluded that we are holding off deprecation of _make_tuples
… scaling problems
Here are the issues currently worked on. The issues that are fixed or in progress will be included in this pull request. The queued issues may be included.
makereplace_make_tuples? #387 - fixedskip_duplicates#388 - fixed