Discussion:
[sword-devel] module driver reorganization proposal
Chris Little
2014-03-18 06:43:45 UTC
We've got quite a few classes in Sword that essentially duplicate code
found elsewhere in Sword, with minor changes. The module drivers are a
prime example.

Specific examples include RawText & RawText4, RawCom & RawCom4, zText &
zText4 (new as of today), zCom & zCom4 (new as of today), and RawLD &
RawLD4, each pair of which differs in that one member uses a 16-bit
value to store entry size and the other member uses a 32-bit value. (The
16-bit sizes permit entries up to 64KiB; the 32-bit sizes permit entries
up to 4GiB.)

There are also the pairs RawText & RawCom, RawText4 & RawCom4, zText &
zCom, and zText4 & zCom4, each pair of which differs very little.

My proposal is to collapse the above classes into three classes:
RawText, zText, and RawLD

Each of these classes would support entry sizes of 2, 3, or 4 bytes
(16-bit = 64KiB entries, 24-bit = 16MiB entries, 32-bit = 4GiB entries).
Internally, the classes would always store sizes as a uint32_t, but
would serialize as 2, 3, or 4 byte size integers, depending on the
parameters passed to the constructor. This will necessitate changing
many of the class method signatures to accept uint32_ts instead of
shorts & longs.
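
As a rough illustration of the idea (not code from the Sword tree), a
size held as a 32-bit value in memory can be written out as 2, 3, or 4
bytes. The function names below are made up for this sketch, and
little-endian byte order on disk is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Largest entry size representable in `width` bytes (width is 2, 3, or 4).
inline std::uint32_t maxEntrySize(unsigned width) {
    assert(width >= 2 && width <= 4);
    return width == 4 ? UINT32_MAX
                      : (std::uint32_t{1} << (8 * width)) - 1;
}

// Write `size` into out[0..width) in little-endian order (assumed layout).
// Returns false if the value does not fit in `width` bytes.
inline bool encodeSize(std::uint32_t size, unsigned width, unsigned char *out) {
    if (size > maxEntrySize(width)) return false;
    for (unsigned i = 0; i < width; ++i)
        out[i] = static_cast<unsigned char>((size >> (8 * i)) & 0xFF);
    return true;
}

// Read a `width`-byte little-endian size back into a 32-bit value.
inline std::uint32_t decodeSize(const unsigned char *in, unsigned width) {
    assert(width >= 2 && width <= 4);
    std::uint32_t size = 0;
    for (unsigned i = 0; i < width; ++i)
        size |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    return size;
}
```

With this shape, a 70000-byte entry is rejected at width 2 and accepted
at width 3 or 4, and the in-memory type never changes.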

Similarly, the classes zVerse & zVerse4, RawVerse & RawVerse4, and RawLD
& RawLD4 would be condensed into zVerse, RawVerse, & RawLD capable of
reading files with 2, 3, or 4-byte entry sizes.

This would not require changes to existing modules. A RawLD4 module will
still work, but we'll use the RawLD driver to read it and parse the '4'
from the end of the driver name to determine that we will read 4-byte
entry sizes.
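
A minimal sketch of that name parsing might look like the following.
DriverSpec and parseDriverName are hypothetical names, and treating a
trailing '3' as the proposed 3-byte width is an assumption. It only
covers the drivers in this proposal; RawGenBook and zLD have no suffix
yet already use 4-byte sizes, so they would need separate handling.

```cpp
#include <cassert>
#include <string>

struct DriverSpec {
    std::string baseName;  // e.g. "RawLD"
    unsigned sizeWidth;    // bytes used for the entry size on disk
};

// Hypothetical helper: split "RawLD4" into {"RawLD", 4}.
// A name without a numeric suffix keeps the historical 2-byte width.
inline DriverSpec parseDriverName(const std::string &modDrv) {
    assert(!modDrv.empty());
    const char last = modDrv.back();
    if (last == '3' || last == '4') {  // '3' is assumed for the new width
        return {modDrv.substr(0, modDrv.size() - 1),
                static_cast<unsigned>(last - '0')};
    }
    return {modDrv, 2};
}
```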

RawCom, zCom, & SWCom classes would then be derived from RawText, zText,
& SWText respectively. Maybe we can even eliminate the *Com classes and
simply add a member variable to indicate whether to act like a
commentary or a Bible.


Advantages of this proposal include all of the things that come with
reduced code duplication:
Less code, reduced API complexity, smaller library size, etc.
Greater consistency, without having to page through half a dozen
distinct classes to keep code consistent.
Bugs only need to be fixed in one location instead of many.
Whatever else makes DRY practices better than WET.

The method described also makes it trivial for us to add the 3-byte
entry size drivers, which should be enough for anything practical (up to
16MiB per entry). And down the road, we could add 5-byte entry size
support with ease for entry sizes up to 1TiB. (No, I'm not suggesting that.)

If you're wondering why RawGenBook & zLD are left out of the proposal,
it's because they both use 4-byte entry sizes already and no 2-byte
versions exist.

--Chris
Jaak Ristioja
2014-03-18 07:24:49 UTC
Good idea! But I'm not sure whether passing such an argument to the
constructor is a good idea. I'd rather use templates and type traits
for this if we only want to support a few cases (16, 24, 32 and maybe
40 bits).
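
To make the template idea concrete, here is one possible shape, purely
as a sketch: SizeField and the aliases are invented names, and the
little-endian layout is an assumption. The width becomes a compile-time
parameter, so each supported width is just an instantiation rather than
a separate hand-written class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical trait-style helper: the size-field width is fixed at
// compile time, one instantiation per supported width.
template <std::size_t Bytes>
struct SizeField {
    static_assert(Bytes >= 2 && Bytes <= 5, "supported widths: 16 to 40 bits");

    static constexpr std::size_t width() { return Bytes; }

    static constexpr std::uint64_t maxValue() {
        return (std::uint64_t{1} << (8 * Bytes)) - 1;
    }

    // Decode a little-endian size field of exactly `Bytes` bytes.
    static std::uint64_t read(const unsigned char *in) {
        assert(in != nullptr);
        std::uint64_t value = 0;
        for (std::size_t i = 0; i < Bytes; ++i)
            value |= static_cast<std::uint64_t>(in[i]) << (8 * i);
        return value;
    }
};

// Each width is then a name, not a duplicated class.
using SizeField16 = SizeField<2>;
using SizeField24 = SizeField<3>;
using SizeField32 = SizeField<4>;
```

The trade-off is that the width has to be known when the code is
compiled, so something still has to pick the right instantiation when a
module is opened.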

Blessings,
Jaak
Jonathan Morgan
2014-03-18 13:46:09 UTC
Hi Jaak,
Post by Jaak Ristioja
Good idea! But I'm not sure whether passing such an argument to the
constructor is a good idea. I'd rather use templates and type traits
for this if we only want to support a few cases (16, 24, 32 and maybe
40 bits).
As I understand it, this configuration is coming from the conf file at
least sometimes (and maybe always). That sounds to me much more like a
property is needed, not a type trait. Does that make sense?
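
A sketch of what I mean, with invented names (IndexReader, makeReader)
and assuming a raw index row is a 4-byte offset followed by the size
field: the width is an ordinary data member set when the module is
opened, so a value that arrives as text at run time can be used
directly.

```cpp
#include <cassert>
#include <string>

// Hypothetical reader whose size-field width is a run-time property.
class IndexReader {
public:
    explicit IndexReader(unsigned sizeWidth) : sizeWidth_(sizeWidth) {
        assert(sizeWidth >= 2 && sizeWidth <= 4);
    }

    unsigned sizeWidth() const { return sizeWidth_; }

    // Bytes per raw index row: 4-byte offset plus the size field (assumed).
    unsigned rowWidth() const { return 4 + sizeWidth_; }

private:
    unsigned sizeWidth_;
};

// The width comes in as a string (for example from a .conf value),
// so no template parameter is involved.
inline IndexReader makeReader(const std::string &widthFromConf) {
    return IndexReader(static_cast<unsigned>(std::stoul(widthFromConf)));
}
```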

Jon
_______________________________________________
sword-devel mailing list: sword-devel at crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
Jaak Ristioja
2014-03-18 14:08:17 UTC
Post by Jonathan Morgan
As I understand it, this configuration is coming from the conf
file at least sometimes (and maybe always). That sounds to me
much more like a property is needed, not a type trait. Does that
make sense?
What do you mean by "property"?

Best regards,
Jaak
DM Smith
2014-03-18 14:58:05 UTC
Chris,

Your suggestion is very similar to JSword's implementation. It has simplified code maintenance.

There are three types of module files: index, compression index and data files. It may do well to handle these separately.
The index consists of fixed-size entries made up of parts. For a raw module these are: offset and size. For a compressed module they are: block, offset and size.
The block and offset are always 32 bits. It is the size that varies in width: today, either 2 or 4 bytes.

So I'd suggest two more classes: RawIndex and a sub-class ZIndex. (Maybe 4, also struct/class RawIndexEntry and ZIndexEntry).

A couple of observations. A row in the file is of fixed width. The size of the file divided by the width of the row gives the number of entries. Finding the i-th entry is simple and obvious.
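
To illustrate those observations (this is a sketch, not JSword or SWORD code): for a raw index held in memory, the entry count and the i-th row fall straight out of the row width. The names are made up, and the layout (4-byte offset followed by the size field, both little-endian) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct RawIndexEntry {
    std::uint32_t offset;  // position of the entry in the data file
    std::uint32_t size;    // entry length, widened to 32 bits in memory
};

// Row width of a raw index: 4-byte offset plus the size field.
inline std::size_t rowWidth(unsigned sizeWidth) { return 4 + sizeWidth; }

// Number of entries is the file size divided by the row width.
inline std::size_t entryCount(std::size_t fileBytes, unsigned sizeWidth) {
    return fileBytes / rowWidth(sizeWidth);
}

// Read an n-byte little-endian unsigned value (n <= 4).
inline std::uint32_t readLE(const unsigned char *p, unsigned n) {
    std::uint32_t value = 0;
    for (unsigned i = 0; i < n; ++i)
        value |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    return value;
}

// Fetch the i-th entry by seeking to i * rowWidth.
inline RawIndexEntry entryAt(const std::vector<unsigned char> &index,
                             std::size_t i, unsigned sizeWidth) {
    assert(i < entryCount(index.size(), sizeWidth));
    const unsigned char *row = index.data() + i * rowWidth(sizeWidth);
    return {readLE(row, 4), readLE(row + 4, sizeWidth)};
}
```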

We've started the above, but still have code duplication related to the index code being in more than one module driver.

Also, I don't see the point of the 3-byte entry size. The only thing it affects is the size of the index file; in memory it will be 32 bits either way. For a Bible, a 3-byte size would save about 65K compared with a 4-byte size. Instead, I'd suggest that from now on our module-making tools only make 4-byte index files. For a Bible, this would add about 128K to the module size.

In Him,
DM
Troy A. Griffitts
2014-03-18 19:21:27 UTC
Post by DM Smith
Chris,
Your suggestion is very similar to JSword's implementation. It has simplified code maintenance.
There are three types of module files: index, compression index and data files. It may do well to handle these separately.
The index consists of fixed-size entries made up of parts. For a raw module these are: offset and size. For a compressed module they are: block, offset and size.
The block and offset are always 32 bits. It is the size that varies in width: today, either 2 or 4 bytes.
SWORD C++ has this same separation. The work to deal with the index
files is primarily isolated in src/modules/common/. These classes are
used by the drivers to do the work. They are not intended as part of
the outward-facing interface to the engine, but simply for code reuse in
the implementation details.

I support the removal of the *4 variants of the drivers. This probably
should have always been an extended property of the original drivers.
Post by DM Smith
Also, I don't see the point of the 3 byte entry. The only thing it affects is the size of the index file. In memory it will be 32bit. For a Bible it would save about 65K to have a 3 byte rather than a 4 byte.
I don't mind the 3 byte derivative. There is no reason to store an
extra byte in every record of the index if it will never practically be
used (not once by any module we currently have).
Post by DM Smith
Post by Chris Little
My proposal is to collapse the above classes into three classes:
RawText, zText, and RawLD
I like the removal of the *4 classes, but I don't like the collapse of
the Text and Com concepts. I have never liked that SWCom is basically
duplicate logic from SWText with very, very few differences (I think
increment and decrement might be different, but I'm not even sure anymore).
This might suggest I'd be in support of the collapse, but I am not. The
concept of a Commentary is very different than that of the text on which
it comments. I realize that the current implementation in SWORD is
practically identical right now, as we only have per-verse commentaries,
but the concept is important to keep separate. SWCom should not extend
SWText. Even though this would allow us to save some code duplication
right now, most of that duplication has already been factored into the
src/modules/common/*verse classes that deal with per-verse material. I
think it is important to keep the concept of a commentary distinct for
future purposes we cannot foresee right now. Though I can sympathize
with the desire to remove the redundancy.
Post by DM Smith
Post by Chris Little
Internally, the classes would always store sizes as a uint32_t, but would serialize as 2, 3, or 4 byte size integers, depending on the parameters passed to the constructor. This will necessitate changing many of the class method signatures to accept uint32_ts instead of shorts & longs.
These method signatures should primarily be isolated to the internal
src/modules/common/ classes and shouldn't affect the public API much, if
at all. We started using our own exact primitive types a few years back
for implementation details which needed exact types across all
platforms, so please use __u32.
Post by DM Smith
Post by Chris Little
This would not require changes to existing modules. A RawLD4 module will still work, but we'll use the RawLD driver to read it and parse the '4' from the end of the driver name to determine that we will read 4-byte entry sizes.
I like this for backward compatibility, but I think we should have an
EntrySize or similar in the .conf files. This lets us maintain the
ModDrv=<real SWORD class name> paradigm in case we ever really do make
it to dynamically loadable drivers in some future release.
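
Roughly what I have in mind, as a sketch only: the EntrySize key name
is just the suggestion above, resolveEntrySize and ConfSection are
made-up names, and the resolution order is an assumption. An explicit
key wins, the legacy trailing '4' on ModDrv is honored for existing
modules, and everything else keeps the 2-byte default.

```cpp
#include <cassert>
#include <map>
#include <string>

// Key/value pairs of one module's .conf section (simplified stand-in).
using ConfSection = std::map<std::string, std::string>;

// Hypothetical resolution order:
//   1. explicit EntrySize key
//   2. legacy trailing '4' on the ModDrv value
//   3. historical 2-byte default
inline unsigned resolveEntrySize(const ConfSection &conf) {
    const auto explicitSize = conf.find("EntrySize");
    if (explicitSize != conf.end()) {
        const unsigned width =
            static_cast<unsigned>(std::stoul(explicitSize->second));
        assert(width >= 2 && width <= 4);
        return width;
    }
    const auto drv = conf.find("ModDrv");
    if (drv != conf.end() && !drv->second.empty() && drv->second.back() == '4')
        return 4;
    return 2;
}
```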
Post by DM Smith
Post by Chris Little
RawCom, zCom, & SWCom classes would then be derived from RawText, zText, & SWText respectively. Maybe we can even eliminate the *Com classes and simply add a member variable to indicate whether to act like a commentary or a Bible.
See comments from earlier. I think it is important to maintain the
SWCom distinction and that SWCom should not conceptually inherit SWText,
though I do understand why you propose such to reduce code duplication.

Thanks for the proposal Chris,

Troy
Peter von Kaehne
2014-03-21 06:53:48 UTC
Post by Troy A. Griffitts
I think it is important to maintain the
SWCom distinction and that SWCom should not conceptually inherit SWText,
though I do understand why you propose such to reduce code duplication.
Complex commentaries I have seen will re-run a chapter or book several
times - synopsis, in detail and in more detail etc. And then throw a few
extra articles in on subjects raised by the text. And right now we are
hard pressed to cover that easily. That is right. So commentaries and
Bible text are/should be ideally different.

I wonder, though, whether the inheritance could go the other way round:
SWText inheriting from SWCom and basically offering a simplified version
of the same. I am far from sure whether this would make a blind bit of
difference in terms of code saving.

In general, it would be useful if either driver allowed more
complexity - study Bibles are pretty hard to implement completely.

Peter
John Austin
2014-03-28 06:50:49 UTC
This sounds great! Hopefully this will also result in adding support for
compressing RawCom4 commentaries? Currently this is not possible, but
would sure be great to have. Thanks Chris!
-John