APACHE COMMONS: serf -*-indented-text-*-
OPEN ISSUES
* rather than having serf_bucket_alloc_t, can we make
apr_allocator_t work "well" for bunches o' small allocations?
Justin says: Take the 'freelist' code from apr-util's bucket allocator
and merge that (somehow) with apr_allocator_t.
(< MIN_ALLOC_SIZE allocations - i.e. <8k)
A: Sander says that the cost-per-alloc'd-byte is too
expensive. the apr_allocator works best for 4k multiples. it
provides a buffer between system alloc and the app. small
allocs should be built as another layer.
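A rough sketch of the "another layer" idea: a fixed-size freelist
carving small chunks out of larger blocks. This is plain C with
malloc standing in for apr_allocator_alloc; every name here is
hypothetical, not a proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE 8192   /* one allocator-friendly 8k chunk (stand-in) */
#define SMALL_SIZE 64     /* a single fixed size class, for simplicity */

typedef struct small_node { struct small_node *next; } small_node;

typedef struct {
    small_node *freelist;   /* recycled small allocations */
    char *block;            /* current block being carved up */
    size_t used;            /* bytes already handed out from the block */
} small_alloc;

/* Grab a small chunk: reuse from the freelist, else carve from a block. */
static void *small_get(small_alloc *a)
{
    if (a->freelist) {
        small_node *n = a->freelist;
        a->freelist = n->next;
        return n;
    }
    if (!a->block || a->used + SMALL_SIZE > BLOCK_SIZE) {
        a->block = malloc(BLOCK_SIZE);  /* apr_allocator_alloc() for real */
        a->used = 0;
    }
    void *p = a->block + a->used;
    a->used += SMALL_SIZE;
    return p;
}

/* Return a chunk to the freelist rather than to the system. */
static void small_put(small_alloc *a, void *p)
{
    small_node *n = p;
    n->next = a->freelist;
    a->freelist = n;
}
```

the point being: the allocator keeps seeing 8k multiples, while the
small-alloc layer absorbs the churn.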
* memory usage probably needs some thought. in particular, the mix
between the bucket allocators and any pools associated with the
connection and the responses. see the point below about buckets
needing pools.
Justin says: Oh, geez, this is awful. Really awful. pools are just
going to be abused. For now, I've tied a pool with the
serf_bucket_allocator as I can't think of a better
short-term solution. I think we can revisit this once we
have a better idea how the buckets are going to operate. I'd
like to get a better idea how filters are going to be written
before deciding upon our pool usage.
gstein says: I think an allocator will be used for a whole
context's worth of buckets. any pool within this
system ought to be transaction-scoped (e.g. tied to a
particular request/response pair).
* the current definition has a "metadata" concept on a per-bucket
basis. however, to record this metadata, we really want to be able
to use apr_hash_t, which implies having a pool.
we do have a respool associated with the response. how does that
get transferred to the individual buckets for their usage? what
about the request side of things?
Justin says: Yes, the request side will need a pool, too. Since we're
using APR, almost all of APR is closed off if we don't have
a pool. So, we do need to allow for buckets to have access to
some pool. The $64mil question is how.
gstein says: each bucket is alloc'd from a serf_bucket_alloc_t
which has an associated pool. the bucket can fetch
that and use it.
Q: is this going to be workable?
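gstein's answer might look roughly like this. apr_pool_t is a dummy
stand-in struct here, and serf_bucket_get_pool is a hypothetical
accessor, not an agreed-upon API.

```c
#include <assert.h>
#include <stddef.h>

typedef struct apr_pool_t { int dummy; } apr_pool_t;  /* stand-in */

typedef struct serf_bucket_alloc_t {
    apr_pool_t *pool;   /* the pool associated with this allocator */
    /* ... freelists etc ... */
} serf_bucket_alloc_t;

typedef struct serf_bucket_t {
    serf_bucket_alloc_t *allocator;  /* every bucket records its allocator */
    /* ... type, data ... */
} serf_bucket_t;

/* hypothetical: a bucket reaches "some pool" through its allocator,
   so apr_hash_t and friends become usable for metadata. */
static apr_pool_t *serf_bucket_get_pool(const serf_bucket_t *bkt)
{
    return bkt->allocator->pool;
}
```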
* How does serf_aggregate_bucket_become(bucket) work?
Justin says: We want bucket to be an aggregate bucket, and for the first
bucket in that aggregate bucket to be the 'original' bucket.
I think we just swap internal data structures, but not sure.
gstein says: we don't necessarily want to preserve the original
bucket. consider a REQUEST bucket which could
"become" an aggregate containing two buckets: one for
the serialized request-line and headers, and one for
the body. there is no REQUEST bucket any more.
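One way "become" can work without preserving the original: morph the
bucket in place, so any outstanding pointers to it stay valid while
its type and private data change. A minimal sketch with made-up type
tags (not serf's actual structures):

```c
#include <assert.h>
#include <stddef.h>

typedef struct bucket_type { const char *name; } bucket_type;

typedef struct bucket {
    const bucket_type *type;  /* vtable-ish type tag */
    void *data;               /* type-private state */
} bucket;

static const bucket_type request_type = { "REQUEST" };
static const bucket_type aggregate_type = { "AGGREGATE" };

/* "become": overwrite type and data in place. the REQUEST bucket
   ceases to exist; the same memory is now an aggregate whose data
   would hold the serialized-header bucket and the body bucket. */
static void bucket_become_aggregate(bucket *bkt, void *aggregate_data)
{
    bkt->type = &aggregate_type;
    bkt->data = aggregate_data;
}
```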
* If anyone knows how to use APR_RING, we could probably use that for the
aggregate bucket list.
gstein says: I think this implies a little entry structure which
contains the "link", plus a pointer to the bucket. I
was thinking of just using an apr array.
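The "little entry structure" would look something like the sketch
below. With APR proper this link would be an APR_RING_ENTRY and the
head an APR_RING_HEAD; a plain sentinel-headed doubly-linked ring
stands in here so the sketch is self-contained.

```c
#include <assert.h>
#include <stddef.h>

typedef struct bucket bucket;        /* opaque for this sketch */

typedef struct agg_entry {
    struct agg_entry *next, *prev;   /* the "link" */
    bucket *bkt;                     /* pointer to the bucket */
} agg_entry;

typedef struct { agg_entry head; } agg_list;  /* sentinel-headed ring */

static void agg_init(agg_list *l)
{
    l->head.next = l->head.prev = &l->head;
}

static void agg_insert_tail(agg_list *l, agg_entry *e)
{
    e->prev = l->head.prev;
    e->next = &l->head;
    l->head.prev->next = e;
    l->head.prev = e;
}
```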
* How does something like deflate get inserted into both the request and
response?
Justin says: I'm still not crystal clear on how this should work.
I'd like the concept of a 'filter' somehow as we can
shield the 'user programmer' from this, but I dunno.
A 'hook' after the request is created, but before it
is sent???
gstein says: when assembling the request bucket, a "deflate"
bucket can be wrapped around the body bucket. it
reads raw content, compresses it, and returns
that. when a response arrives, an "inflate" bucket is
wrapped around the body bucket to inflate the
response. (of course, the decision points are made
based upon headers, server capabilities, and whatnot)
filters are just buckets that wrap other buckets.
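"filters are just buckets that wrap other buckets" can be sketched
like so. Uppercasing stands in for deflate (no zlib dependency), and
the read interface is a toy, not serf's real one.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* toy bucket interface: read up to len bytes into buf, return count. */
typedef size_t (*read_fn)(void *self, char *buf, size_t len);
typedef struct { read_fn read; void *self; } any_bucket;

/* a simple "body" bucket serving a string. */
typedef struct { const char *s; size_t off; } str_bucket;
static size_t str_read(void *self, char *buf, size_t len)
{
    str_bucket *sb = self;
    size_t left = strlen(sb->s) - sb->off;
    size_t n = left < len ? left : len;
    memcpy(buf, sb->s + sb->off, n);
    sb->off += n;
    return n;
}

/* the "filter": wraps another bucket, transforms whatever it reads.
   a real deflate bucket would compress here instead of uppercasing. */
typedef struct { any_bucket inner; } xform_bucket;
static size_t xform_read(void *self, char *buf, size_t len)
{
    xform_bucket *x = self;
    size_t n = x->inner.read(x->inner.self, buf, len);  /* raw content */
    for (size_t i = 0; i < n; i++)
        buf[i] = (char)toupper((unsigned char)buf[i]);  /* transform */
    return n;
}
```

the request assembler just swaps the body bucket for the wrapper; the
consumer never knows the difference.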
* how to signify "bucket is done"? len==0? NULL data pointer? return
APR_EOF? Note that we intend for the response handler to return
APR_EOF to denote that the response has been completely read. I'm
of a mind to say "return APR_EOF".
A: return APR_EOF. data may also be returned at the same
time. this allows a bucket to return "all available
information" which includes data and the notation that no more
data exists.
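The "data plus EOF in one call" contract looks like this in practice.
A toy bucket and status codes stand in for serf/APR here; the shape of
the consumer loop is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MY_OK  0   /* stand-in for APR_SUCCESS */
#define MY_EOF 1   /* stand-in for APR_EOF */

/* toy bucket over a fixed string, read in chunks. the final read
   returns the remaining data AND MY_EOF in the same call. */
typedef struct { const char *s; size_t off, chunk; } toy_bucket;

static int toy_read(toy_bucket *b, const char **data, size_t *len)
{
    size_t left = strlen(b->s) - b->off;
    *data = b->s + b->off;
    *len = left < b->chunk ? left : b->chunk;
    b->off += *len;
    return (b->off == strlen(b->s)) ? MY_EOF : MY_OK;
}
```

note the consumer must process *data/*len even when the status is EOF,
or it drops the tail of the response.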
* if the poll() does not return POLLHUP, then how do we detect EOF?
we'd end up seeing POLLIN, go for a read(), and get back len==0
(signals such as SIGPIPE are raised on writes, not reads). if it
just keeps returning len==0, then we could have a problem. part of
the issue here is that we
get a POLLIN from poll(), but the reading occurs in an entirely
different context (call a response handler, which reads a bucket,
which trickles down to a socket bucket, and then reading the
socket object). transmitting "there should be something to read"
over to that other context could be difficult.
see: http://www.greenend.org.uk/rjk/2001/06/poll.html
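For reference, the underlying POSIX behavior: once the peer hangs up,
poll() reports the descriptor as ready (POLLIN and/or POLLHUP), and a
read() then returns 0 bytes, which is the orderly-EOF indication. A
small self-contained demonstration on a pipe (the helper name is made
up):

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* after the peer closes, poll() says "readable" and read() returns 0.
   no signal is delivered to the reading side. */
static int read_would_see_eof(int fd)
{
    struct pollfd p = { fd, POLLIN, 0 };
    char c;
    if (poll(&p, 1, 1000) <= 0)
        return 0;                  /* timeout or error: no EOF observed */
    return read(fd, &c, 1) == 0;   /* 0 bytes == orderly EOF */
}
```

so the socket bucket can treat read()==0 as EOF; the harder part, as
noted above, is ferrying "ready to read" over to the response-handler
context.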
* review the various OUT parameters and see if any make sense to
allow a NULL value to designate non-interest in the result.
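The convention under review is just this guard pattern (function and
values here are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* each OUT parameter is written only if the caller passed a non-NULL
   address, so NULL designates non-interest in that result. */
static void bucket_status(int *len_out, int *eof_out)
{
    if (len_out) *len_out = 42;   /* hypothetical result values */
    if (eof_out) *eof_out = 1;
}
```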