Discussion:
[DRMAA-WG] Conference call - Apr 27th - 19:00 UTC
Peter Tröger
2011-04-26 22:57:41 UTC
Dear all,

the next DRMAA conf call is scheduled for Apr 27th, 19:00 UTC. We meet
on Skype; please find me under my user name "potsdam_pit".

Preliminary meeting agenda:

1. Meeting secretary for this meeting?
2. Solving remaining issues in DRMAAv2 Draft 3 (see attachment, starting
from page 18)

Sorry, I didn't have the time to prepare a new draft.

Best regards,
Peter.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: drmaav2_draft3_annotated.pdf
Type: application/pdf
Size: 619308 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/drmaa-wg/attachments/20110427/16d428bc/attachment-0001.pdf
Peter Tröger
2011-04-27 21:46:53 UTC
Participants: Daniel, Mariusz, Roger, Andre (SAGA), Peter

Quick check of last week's decisions, all agreed

Line 530 - checkpointability attribute in job template
- Grid Engine expresses checkpointability as string reference to
checkpointing environment
- would be boolean flag in Condor, indicating standard universe
- From SAGA perspective, no real use case
- Decision: Dropped

Line 578 - optional eMail attribute
- accepted by group

Line 609 - Staging support
- reformulate to allow submission and execution host being the same machine
- denote support for 'hierarchical copying' as implementation-specific
- reformulate to state that with parallel jobs, copy must target at
least the master node, and may also copy the files to other hosts
- clarify relationship between job working directory and relative paths

Line 707 - Reaction to reaching soft / hard limits
- Grid Engine: Signal depends on particular limit type
- Agreement that crossing a hard limit should lead to FAILED state of
the DRMAA job
- Agreement to remove softResourceLimits completely, since DRMAA cannot
promise any kind of common semantics, and since the attribute is not
important enough to add it as opaque concept (as with slots)

Section 9.2.4 / 9.2.7:
- reservedSlots should be mandatory information, reservedMachines should
be optional information

Agreement to specify possible error codes per method after some
implementations were done

Line 751 - Reservation without time frame
- Makes no sense, since it might be way too short for the user
  -> raise invalid argument exception on UNSET/UNSET/UNSET
- add rationale why startTime=UNSET is not equal to startTime=NOW
  - handy concept supported by some, but not all DRM systems
  - Emulation in the DRMAA library is not a valid option, since this
    would lead to situations where the reservation already arrives 'too late'
    in the DRM system
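
The UNSET/UNSET/UNSET rule could be checked roughly as follows. This is only a Python sketch: ReservationTemplate, InvalidArgumentError and check_time_frame are hypothetical names chosen for illustration, and UNSET is modelled as None; none of this is taken from the draft.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

UNSET = None  # assumption: the DRMAA "UNSET" value is modelled as None


class InvalidArgumentError(ValueError):
    """Hypothetical stand-in for the invalid argument exception."""


@dataclass
class ReservationTemplate:
    # Hypothetical template holding only the three time-frame attributes.
    start_time: Optional[datetime] = UNSET
    end_time: Optional[datetime] = UNSET
    duration: Optional[timedelta] = UNSET


def check_time_frame(template: ReservationTemplate) -> None:
    """Reject a reservation request that carries no time frame at all."""
    if (template.start_time is UNSET
            and template.end_time is UNSET
            and template.duration is UNSET):
        raise InvalidArgumentError(
            "startTime, endTime and duration are all UNSET")
```

A template with at least one of the three attributes set passes this particular check; how the remaining combinations are interpreted is left to the specification.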

Best regards,
Peter.
--
drmaa-wg mailing list
drmaa-wg at ogf.org
http://www.ogf.org/mailman/listinfo/drmaa-wg
Daniel Gruber
2011-04-28 07:02:30 UTC
There are currently two problems from the Grid Engine side:
- There seems to be no way to get the desired NOW behavior as a GE-specific enhancement
without breaking compatibility (at least in this section, the *optional* NOW keyword is not defined)
- GE currently has no sliding-window support for the SET/SET/SET case where duration is shorter than endTime-startTime
(the GE DRMAA implementation then has a problem similar to that of the other DRMs which do not support NOW as startTime)

My suggestions for this section (5.6.2):

- Add the *optional* "NOW" constant -> if an implementation does not
support it, it is treated like UNSET (InvalidAttributeException)
- If startTime, endTime and duration are all set and duration is shorter than endTime-startTime, the
sliding-window approach (take "the earliest point in time") could be made optional.
That means: take startTime and duration, or *optionally* search for the earliest point in
time.

I know we discussed it more than once, but having these options would make it
much easier to get a compatible implementation.

Cheers,

Daniel
---------------------------------------------------------------------


Notice from Univa Postmaster:


This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. This message has been content scanned by the Univa Mail system.



---------------------------------------------------------------------

Peter Tröger
2011-04-29 12:14:10 UTC
Post by Daniel Gruber
- Add the *optional* "NOW" constant -> if an implementation does not
support it, it is treated like UNSET (InvalidAttributeException)
My understanding of the agreed result was a little bit more radical. NOW is not supported by all DRM systems, and it is not as crucial as slots ;-), so we can just leave it out. Applications will then start to build their own "NOW" workarounds (current local time plus ... hmmm ... 10s), which is completely fine in this specific case.
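
Such an application-side workaround might look like the following Python sketch. The function name emulated_now and the default margin are illustrative assumptions; the 10 seconds are simply the value mentioned above, not a recommended figure.

```python
from datetime import datetime, timedelta


def emulated_now(margin_seconds: float = 10.0) -> datetime:
    """Return the current local time plus a safety margin.

    The margin gives the request time to reach the DRM system before the
    requested start time has already passed.
    """
    # astimezone() attaches the local time zone, so the value is unambiguous.
    return datetime.now().astimezone() + timedelta(seconds=margin_seconds)
```

The application would pass the returned value as startTime when creating the reservation.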
Post by Daniel Gruber
- If startTime, endTime and duration are all set and duration is shorter than endTime-startTime, the
sliding-window approach (take "the earliest point in time") could be made optional.
That means: take startTime and duration, or *optionally* search for the earliest point in
time.
I don't understand this. What is the alternative to searching for the earliest feasible startTime? Ignoring the duration value? Or ignoring the end time?

Best regards,
Peter.
Daniel Gruber
2011-04-29 12:27:55 UTC
Post by Peter Tröger
My understanding of the agreed result was a little bit more radical. NOW is not supported by all DRM systems, and it is not as crucial as slots ;-), so we can just leave it out. Applications will then start to build their own "NOW" workarounds (current local time plus ... hmmm ... 10s), which is completely fine in this specific case.
We should really take an optional NOW constant into account. Applications could do their own workaround, but
*if* a DRM has built-in support for this, it is really hard to offer this functionality without the constant. Leaving it out just prevents it from being
implemented optionally. What do we lose with a new optional "NOW" constant?
Post by Peter Tröger
I don't understand this. What is the alternative to searching for the earliest feasible startTime? Ignoring the duration value? Or ignoring the end time?
One of them; for me it really does not matter which one is ignored, it should just be defined.
Maybe the best solution would be to ignore "end time" when "start time + duration" <= "end time".
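
The rule I have in mind could be sketched like this in Python. The name resolve_window is hypothetical, and raising an error when the duration does not fit is my own assumption, since that case is not settled here.

```python
from datetime import datetime, timedelta
from typing import Tuple


def resolve_window(start: datetime, end: datetime,
                   duration: timedelta) -> Tuple[datetime, datetime]:
    """Resolve the SET/SET/SET case without any sliding-window search."""
    if start + duration <= end:
        # The end time is ignored: reserve from start for exactly `duration`.
        return start, start + duration
    # Assumption: a duration that does not fit is treated as an error.
    raise ValueError("duration is longer than endTime - startTime")
```

With this rule the DRM system never has to search inside the window, which is the part GE does not support today.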

Cheers,

Daniel
Peter Tröger
2011-04-29 12:39:24 UTC
Post by Daniel Gruber
We should really take an optional NOW constant into account. Applications could do their own workaround, but
*if* a DRM has built-in support for this, it is really hard to offer this functionality without the constant. Leaving it out just prevents it from being
implemented optionally. What do we lose with a new optional "NOW" constant?
Every single optional attribute weakens the standard. In the end, we all want to achieve portable applications without a lot of ifs and whens. We will take a majority decision on this in the next conf call.
Post by Daniel Gruber
One of them; for me it really does not matter which one is ignored, it should just be defined.
Maybe the best solution would be to ignore "end time" when "start time + duration" <= "end time".
Ok, if there is no further objection, I will add this one.

Best,
Peter.
Mariusz Mamoński
2011-04-29 12:39:20 UTC
Post by Daniel Gruber
We should really take an optional NOW constant into account.
Applications could do their own workaround, but
*if* a DRM has built-in support for this, it is really hard to offer this
functionality without the constant. Leaving it out just prevents it from being
implemented optionally. What do we lose with a new optional "NOW" constant?
For me it is OK, as long as we can introspect whether NOW is supported by
the given DRM system.
Post by Daniel Gruber
One of them; for me it really does not matter which one is ignored,
it should just be defined.
Maybe the best solution would be to ignore "end time" when "start time +
duration" <= "end time".
Cheers,
Daniel
The same problem here. In my initial proposal:

http://fury.man.poznan.pl/~mmamonski/wiki/index.php/DRMAAv2/Advance_Reservation

"duration
Reservation duration. If reservation duration is shorter than endTime
- startTime the earliest reservation (matching the requirements, e.g.:
slotsCount) will be created. If this attribute is omitted then the
duration is assumed to be equal to endTime - startTime. Optional
attribute."


I wanted the "duration" to be optional. Now I remember why ;-) This
was an easy way to determine whether the DRM system supports searching
for the earliest feasible reservation in the given time window, i.e. if the
system supports the duration attribute, it also offers the
aforementioned functionality.
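
To make the quoted "duration" semantics concrete, here is a rough Python sketch of the search. The function name earliest_reservation and the list of already occupied intervals (busy) are assumptions made for illustration; a real DRM system would also have to match slotsCount and other requirements.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

Interval = Tuple[datetime, datetime]


def earliest_reservation(start: datetime, end: datetime,
                         duration: Optional[timedelta] = None,
                         busy: Iterable[Interval] = ()) -> Optional[Interval]:
    """Find the earliest free slot of length `duration` inside [start, end]."""
    if duration is None:
        # Omitted duration is assumed to be equal to endTime - startTime.
        duration = end - start
    candidate = start
    for busy_start, busy_end in sorted(busy):
        if candidate + duration <= busy_start:
            break  # the slot fits before this occupied interval
        # Otherwise the slot cannot start before this interval has ended.
        candidate = max(candidate, busy_end)
    if candidate + duration <= end:
        return candidate, candidate + duration
    return None  # no feasible reservation in the given time window
```

When duration is shorter than endTime - startTime, the candidate slides past occupied intervals; when it is omitted, only the full window itself can match.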
--
Mariusz
Peter Tröger
2011-04-29 12:55:12 UTC
Post by Mariusz Mamoński
I wanted the "duration" to be optional. Now I remember why ;-) This
was an easy way to determine whether the DRM system supports searching
for the earliest feasible reservation in the given time window, i.e. if the
system supports the duration attribute, it also offers the
aforementioned functionality.
Sounds very reasonable. I like this one.

Best,
Peter.
Daniel Gruber
2011-04-29 12:56:18 UTC
Post by Peter Tröger
Sounds very reasonable. I like this one.
me too

Daniel
Mariusz Mamoński
2011-05-01 21:44:44 UTC
Post by Peter Tröger
Line 707 - Reaction to reaching soft / hard limits
- Grid Engine: Signal depends on particular limit type
- Agreement that crossing a hard limit should lead to FAILED state of
the DRMAA job
- Agreement to remove softResourceLimits completely, since DRMAA cannot
promise any kind of common semantics, and since the attribute is not
important enough to add it as opaque concept (as with slots)
i promised to do some research, so:

we are mixing different resources whose limits have different purposes
and thus different associated policies:

enum ResourceLimitType { CORE_FILE_SIZE , CPU_TIME , DATA_SEG_SIZE
, FILE_SIZE , OPEN_FILES , STACK_SIZE , VIRTUAL_MEMORY
, WALLCLOCK_TIME };

let's take the first one:

CORE_FILE_SIZE and Grid Engine

man queue_conf: " The remaining parameters in the queue configuration
template specify per job soft and hard resource limits as implemented
by the setrlimit(2) ..."

man setrlimit " RLIMIT_CORE Maximum size of core file. When 0 no core
dump files are created. When non-zero, larger dumps are truncated to
this size."

and the difference between Soft and Hard limit is defined as follows:
" The hard limit acts as a ceiling for the soft limit: an
unprivileged process may only set its soft limit to a value in the
range from 0 up to the hard limit, and (irreversibly) lower its hard
limit."

exceeding other limits like OPEN_FILES would just result in errors on
calls like open(), which the application can handle and still exit with 0.

So the agreement that "crossing a hard limit should lead to FAILED"
should be valid only to some of the limits e.g.: WALLCLOCK_TIME,
CPU_TIME.
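
As a rough illustration of the OPEN_FILES case (a sketch, not part of the original mail): the Python snippet below lowers the soft RLIMIT_NOFILE limit of the current process, opens files until open() fails, and returns the errno instead of dying. The function name open_until_limit and the default limit of 32 are made up for this example; it assumes a POSIX system where Python's resource module is available.

```python
import errno
import os
import resource


def open_until_limit(soft_limit: int = 32) -> int:
    """Lower the soft RLIMIT_NOFILE, open files until open() fails,
    and return the errno of that failure (hypothetical helper)."""
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The hard limit is the ceiling for the soft limit, so clamp to it.
    if old_hard != resource.RLIM_INFINITY and soft_limit > old_hard:
        soft_limit = old_hard
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, old_hard))
    fds = []
    try:
        while True:
            fds.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        # The process is not killed; it just sees a failed system call.
        return exc.errno
    finally:
        for fd in fds:
            os.close(fd)
        # Raising the soft limit back up to its old value is permitted
        # because the hard limit was left untouched.
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))


if __name__ == "__main__":
    code = open_until_limit()
    print(errno.errorcode.get(code, code))
```

The failed call reports EMFILE and the process keeps running, so it can still exit with 0. This is the reason such a limit violation does not naturally map to a FAILED job.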
Post by Peter Tröger
- reservedSlots should be mandatory information, reservedMachines should
be optional information
Agreement to specify possible error codes per method after some
implementations were done
Line 751 - Reservation without time frame
- Makes no sense, since it might be way too short for the user
-> raise invalid argument exception on UNSET/UNSET/UNSET
- add rationale why startTime=UNSET is not equal to startTime=NOW
  - handy concept supported by some, but not all DRM systems
  - Emulation in the DRMAA library is not a valid option, since this
would lead to situations where the reservation already arrives 'too late'
in the DRM system
Best regards,
Peter.
--
Mariusz
Peter Tröger
2011-05-02 06:42:25 UTC
Hi,
Post by Mariusz Mamoński
Post by Peter Tröger
Participants: Daniel, Mariusz, Roger, Andre (SAGA), Peter
Line 707 - Reaction on reaching soft / hard limits
- Grid Engine: Signal depends on particular limit type
- Agreement that crossing a hard limit should lead to FAILED state of
the DRMAA job
- Agreement to remove softResourceLimits completely, since DRMAA cannot
promise any kind of common semantics, and since the attribute is not
important enough to add it as opaque concept (as with slots)
we are mixing different resources whose limits have different purposes
enum ResourceLimitType { CORE_FILE_SIZE , CPU_TIME , DATA_SEG_SIZE
, FILE_SIZE , OPEN_FILES , STACK_SIZE , VIRTUAL_MEMORY
, WALLCLOCK_TIME };
CORE_FILE_SIZE and Grid Engine
man queue_conf: " The remaining parameters in the queue configuration
template specify per job soft and hard resource limits as implemented
by the setrlimit(2) ..."
man setrlimit " RLIMIT_CORE Maximum size of core file. When 0 no core
dump files are created. When non-zero, larger dumps are truncated to
this size."
" The hard limit acts as a ceiling for the soft limit: an
unprivileged process may only set its soft limit to a value in the
range from 0 up to the hard limit, and (irreversibly) lower its hard
limit."
exceeding other limits like OPEN_FILES would just result in errors on
calls like open(), which the application can handle and still exit with 0.
So the agreement that "crossing a hard limit should lead to FAILED"
should be valid only to some of the limits e.g.: WALLCLOCK_TIME,
CPU_TIME.
That's an issue. I see basically three options here:

1) We define the hard limit violation behavior per parameter. In this case, we could add the soft limits again with the same approach.
2) We declare the job termination as MAY happen at any time after violation, and stick with leaving out the soft limits.
3) We drop resource limits completely.

Number 1 is the most explicit (== good), but demands careful research at the operating system level. Number 2 is our usual safety net. Number 3 is as explicit as number 1, but people may miss the feature. And no, doing it the 'slots' way is not an option ;-) ...
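
Option 1 could be sketched roughly as a per-limit table. The Python snippet below is only an illustration: the set of terminating limits follows Mariusz's suggestion quoted above (WALLCLOCK_TIME, CPU_TIME), and the names TERMINATING_LIMITS and state_after_hard_limit are hypothetical, not part of any DRMAA draft.

```python
from enum import Enum, auto
from typing import Optional


class ResourceLimitType(Enum):
    # Same members as the enum quoted from the draft.
    CORE_FILE_SIZE = auto()
    CPU_TIME = auto()
    DATA_SEG_SIZE = auto()
    FILE_SIZE = auto()
    OPEN_FILES = auto()
    STACK_SIZE = auto()
    VIRTUAL_MEMORY = auto()
    WALLCLOCK_TIME = auto()


# Assumption for this sketch: only these limits end the job when crossed.
TERMINATING_LIMITS = frozenset(
    {ResourceLimitType.CPU_TIME, ResourceLimitType.WALLCLOCK_TIME}
)


def state_after_hard_limit(limit: ResourceLimitType) -> Optional[str]:
    """Return "FAILED" if crossing the hard limit is defined to end the job,
    or None if the outcome is left to the application."""
    return "FAILED" if limit in TERMINATING_LIMITS else None


if __name__ == "__main__":
    for limit in ResourceLimitType:
        print(limit.name, state_after_hard_limit(limit))
```

Filling in the remaining entries is exactly the operating-system research that option 1 demands.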

Best regards,
Peter.
Mariusz Mamoński
2011-04-30 18:47:18 UTC
Hi,

I finally managed to read the current version of spec more carefully.
Below some comments (line numbering corresponds to version annotated
as "draft3"):

line 81: DRMAA1 -> DRMAA Version 1 [reference]
94,95: A Exec.. -> An Exec
159: advanced -> advance
296: "Machine structure" - should we include machine state here (e.g.
down, administratively down, available, busy, ...) ?
316: consistent... -> consistent among all Machine struct instances.
Moreover any reported name should be a syntactically correct input for
the candidateMachines attribute of the JobTemplate. ???
361: any jobSubState - is there really any case where this would be a
complex object? Why not just use a string here? (Yes i know, in the spec
there is a requirement that the language binding should define a conversion
to String for every object, but this may be complex... ;-)
370: missing \n
377-383: running, buffered, purged -> i think this section needs to
be more precise and verbose. In DRMAA 1.0 the wait call was
responsible for reaping the jobs. This is important because some DRMS
do not "buffer" jobs at all (or do it only for a very short time), so the
buffering has to be done in the DRMAA library (for the session's jobs
only). This raises the question: how long to buffer the job
information...
395: exitStatus - should we state here that the valid exitStatus
values are 0-125 ?
445: cpuTime - should we state here that it is the cumulative time across
all the job processes? i.e. cpu time can be greater than wall clock
time for parallel jobs
497: maybe we should add "Dictionary consumableResources;" (see Nadev's
e-mail). I also raised this during one of the last telcos...
594: "execution host" -> "submission host" ???
652: maxSlots should be optional (e.g. Torque does not support range values)
657: SHOULD -> MAY - at least as long as we don't have predefined JobCategories ;-)
785: SessionManagementException - what is the added value of this
exception? Can it be thrown from operations other than
open/close/destroy Session? If not, then why don't we have
WaitException, RunException? ;-)
791: OutOfMemoryException - can we also throw this exception when the
user-supplied buffer was too small?
829: reservationSupported - maybe we can move it now to
DrmaaReflective interface?
948: FAILED vs DONE - maybe we should be more precise about situations
where the job was started but, e.g., exited with exit code != 0 (i
believe this should be DONE), was signalled, or was terminated via DRMAA.
967: REQUEUED, REQUEUED_HELD and BES states. The BES state model
prohibits a transition from Running to Pending, so it should
be the Running state. Also, the state names in brackets look like a
specialization of one of the BES implementations (i will not say which
implementation ;-) so they are definitely non-normative.
1035: The largest valid value for endIndex MUST be defined by the
language binding. - there may also be a DRMS constraint.
1047: "only one of the active thread..." - is this requirement really
needed? i'm asking because i'm afraid this would increase complexity
of the implementations (do you remember the "session any" and its
coincidence with run job operations?). This may be related to comment
377-383.
1063: "DrmaaCallback Interface"....

I just wonder if the requirement "An implementation SHOULD also
disallow any library calls while the callback function is running, to
avoid recursion scenarios. It is RECOMMENDED to raise
TryLaterException in this case." is really needed. If we want to keep
this requirement is the Job object useful at all as we can only read
the jobId from it?

1109-1110: why do those methods return the Job objects?

1262: footnote 30 (what about symmetry ;-) Also, the last decision was to
have a separate ReservationInfo struct:
http://www.mail-archive.com/drmaa-wg at ogf.org/msg00250.html (when was it
revoked?)

1508: reservationInfoOpt, reservationInfoImpl - what if one wants to
provide more information about the reservation? Also the symmetry
rule ;-), relates to 1262
should we also move the drmsJobCategoryNames here (from
MonitoringSession)?
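
For the exitStatus / FAILED vs DONE comments (395 and 948), the distinction between "exited with a non-zero code" and "was signalled" can be made concrete with a small Python sketch. It is an illustration only: the helper names classify and run_and_classify are hypothetical, a POSIX sh is assumed, and how these two outcomes map onto DRMAA states is exactly the open question, not something the snippet decides.

```python
import signal
import subprocess


def classify(returncode: int) -> str:
    """Describe how a child process ended, based on the return code
    reported by Python's subprocess module on POSIX."""
    if returncode < 0:
        # subprocess reports termination by signal N as -N.
        return f"signalled ({signal.Signals(-returncode).name})"
    return f"exited ({returncode})"


def run_and_classify(shell_command: str) -> str:
    """Run a command through sh and classify its termination."""
    completed = subprocess.run(["sh", "-c", shell_command], check=False)
    return classify(completed.returncode)


if __name__ == "__main__":
    print(run_and_classify("exit 3"))         # normal exit, non-zero code
    print(run_and_classify("kill -KILL $$"))  # shell kills itself
```

A job that exits with code 3 still terminated on its own, while the second command never produced an exit code at all.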


sorry for not waiting for the newest version but i wanted to finish it
before i go on holiday (i will not be able to join the next
telco)


All the best,
--
Mariusz
Peter Tröger
2011-05-11 14:58:14 UTC
Hi,

thanks for your huge contribution with this. Here are my comments. If the OGF.ORG SVN works again, I will update the document sources on the server. The new PDF follows today.

There are three kinds of reactions I state:

(1) Added. No discussion needed, I just performed the corresponding document modification.
(2) Ignored. I am pretty confident that this was debated and decided well enough in the group, so I am not willing to re-open discussion again. The group is free to disagree.
(3) Obsoleted. Recent document modifications already established the proposal as a fact.

Best regards,
Peter.
Post by Mariusz Mamoński
Hi,
I finally managed to read the current version of spec more carefully.
Below some comments (line numbering corresponds to version annotated
line 81: DRMAA1 -> DRMAA Version 1 [reference]
94,95: A Exec.. -> An Exec
159: advanced -> advance
296: "Machine structure" - should we include machine state here (e.g.
down, administratively down, available, busy, ...) ?
316: consistent... -> consistent among all Machine struct instances.
Added.
Post by Mariusz Mamoński
Moreover any reported name should be a syntactically correct input for
the candidateMachines attribute of the JobTemplate. ???
candidateMachines takes MachineList as input.
Post by Mariusz Mamoński
361: any jobSubState - is there really any case where this would be a
complex object? Why not just use a string here? (Yes i know, in the spec
there is a requirement that the language binding should define a conversion
to String for every object, but this may be complex... ;-)
Ignored, this was already discussed and decided.
Post by Mariusz Mamoński
370: missing \n
Added.
Post by Mariusz Mamoński
377-383: running, buffered, purged -> i think this section needs to
be more precise and verbose. In DRMAA 1.0 the wait call was
responsible for reaping the jobs. This is important because some DRMS
do not "buffer" jobs at all (or do it only for a very short time), so the
buffering has to be done in the DRMAA library (for the session's jobs
only). This raises the question: how long to buffer the job
information...
Added as ToDo.
Post by Mariusz Mamoński
395: exitStatus - should we state here that the valid exitStatus
values are 0-125 ?
Ignored, this was already discussed and decided.
Post by Mariusz Mamoński
445: cpuTime - should we state here that it is the cumulative time across
all the job processes? i.e. cpu time can be greater than wall clock
time for parallel jobs
Added, also for wall clock time.
Post by Mariusz Mamoński
e-mail I also raised this during one of the last telcos...
See meeting minutes.
Post by Mariusz Mamoński
594: "execution host" -> "submission host" ???
Why this ? inputPath and friends relate to files that are used by the running job on the execution host.
Post by Mariusz Mamoński
652: maxSlots should be optional (e.g. Torque does not support range values)
Added as ToDo.
Post by Mariusz Mamoński
657: SHOULD -> MAY - at least as long as we don't have predefined JobCategories ;-)
Ignored, this was already discussed and decided.
Post by Mariusz Mamoński
785: SessionManagementException - what is the added value of this
exception? Can it be thrown from operations other than
open/close/destroy Session? If not, then why don't we have
WaitException, RunException? ;-)
Added as ToDo.
Post by Mariusz Mamoński
791: OutOfMemoryException - can we also throw this exception when the
user-supplied buffer was too small?
Added.
Post by Mariusz Mamoński
829: reservationSupported - maybe we can move it now to
DrmaaReflective interface?
Obsoleted.
Post by Mariusz Mamoński
948: FAILED vs DONE - maybe we should be more precise about situations
where the job was started but, e.g., exited with exit code != 0 (i
believe this should be DONE), was signalled, or was terminated via DRMAA.
Ignored, this was already discussed and decided.
Post by Mariusz Mamoński
967: REQUEUED, REQUEUED_HELD and BES states. The BES state model
prohibits a transition from Running to Pending, so it should
be the Running state. Also, the state names in brackets look like a
specialization of one of the BES implementations (i will not say which
implementation ;-) so they are definitely non-normative.
Added. And yes, this is why the table title contains "example"
Post by Mariusz Mamoński
1035: The largest valid value for endIndex MUST be defined by the
language binding. - there may also be a DRMS constraint.
Added.
Post by Mariusz Mamoński
1047: "only one of the active thread..." - is this requirement really
needed? i'm asking because i'm afraid this would increase complexity
of the implementations (do you remember the "session any" and its
coincidence with run job operations?). This may be related to comment
377-383.
Ignored, this was already discussed and decided.
Post by Mariusz Mamoński
1063: "DrmaaCallback Interface"....
I just wonder if the requirement "An implementation SHOULD also
disallow any library calls while the callback function is running, to
avoid recursion scenarios. It is RECOMMENDED to raise
TryLaterException in this case." is really needed. If we want to keep
this requirement is the Job object useful at all as we can only read
the jobId from it?
Added as ToDo.
Post by Mariusz Mamoński
1109-1110: why do those methods return the Job objects?
Ignored, this was already discussed and decided.
Post by Mariusz Mamoński
1262: footnote 30 (what about symmetry ;-) Also last decision was to
http://www.mail-archive.com/drmaa-wg at ogf.org/msg00250.html (when it
was revoked?)
Obsoleted.
Post by Mariusz Mamoński
1508: reservationInfoOpt, reservationInfoImpl - what if one wants to
provide more information about the reservation? Also the symmetry
rule ;-), relates to 1262
Obsoleted.
Post by Mariusz Mamoński
should we also move the drmsJobCategoryNames here (from
MonitoringSession)?
No, since DrmaaReflective is only about introspection support for optional / impl. specific attributes. Added ToDo to clarify if this should move to the new generic capability check.