[Cz-biology] Auto-created articles about genes
Larry Sanger
sanger at citizendium.org
Thu Sep 20 15:04:14 CDT 2007
Sorry, a sentence in there should read: "I don't know how likeLY it would be
in this particular case, but I wouldn't want to TAKE my chances
myself."
> -----Original Message-----
> From: cz-biology-bounces at mail.citizendium.org
> [mailto:cz-biology-bounces at mail.citizendium.org] On Behalf Of
> Larry Sanger
> Sent: Thursday, September 20, 2007 4:01 PM
> To: 'Biology Workgroup List'
> Subject: Re: [Cz-biology] Auto-created articles about genes
>
>
> Thanks for this, Andrew. My comments:
>
> > >********************************
> > > How does this effort relate to the many other gene
> > >databases/portals already available?
> >
> > There are two main advantages of this effort, both of which
> > stem from the fact that CZ is a wiki. First, all the other
> > gene portals (our SymAtlas database included) are primarily
> > composed of tag-value pairs (e.g., symbol = "APP", function =
> > "apoptosis", etc.). Second, other gene portals are 99%
> > one-way communication, from data providers to data consumers.
> > Of course, we all here know that wikis are a great
> > complementary resource to these types of databases, allowing
> > both free-text and user-contributed gene annotation.
>
> That makes sense.
>
> > >********************************
> > > since a parallel effort is intended at Wikipedia, will the
> > >intent be substantially different? Other than our use of
> > >subpages, how might our articles/clusters differ from
> > >Wikipedia's? If they wouldn't differ appreciably, is that a
> > >reason for us not to do it (or, indeed, to insist on doing it)?
> >
> > I've been thinking about this issue quite a bit, since there
> > is a compelling argument that doing parallel efforts at CZ
> > and WP dilutes
> > the impact and contributions of both.
>
> Indeed, this is something you ought to think more about.
> Most important is this consideration: you evidently want
> geneticists to work systematically and much on these
> articles. But if that work is done in the context of
> Wikipedia, there is an excellent chance it will simply go to
> waste. There are various reasons for this. The geneticists
> will grow disgusted, as very many experts become, with
> Wikipedia, and stop working. The work that they do then
> "rots" on Wikipedia, as no one adequately knowledgeable
> maintains it. There is also this danger on CZ--but it would
> be less of a danger, I think, in the long run.
>
> You could also find articles, types of information, work of
> particular individuals, etc., all in the crosshairs of
> overzealous Wikipedia admins. I don't know how like it would
> be in this particular case, but I wouldn't want to make my
> chances myself. I mean, for example, if I wanted to upload a
> database of information about great philosophers and
> philosophical texts, say, I certainly wouldn't want it left in
> the hands of Wikipedia admins. You have to understand that
> you, as an organization and as experts, *don't have any
> official authority* on Wikipedia. Decisions about your
> information are not in your hands, they are ultimately in the
> hands of people who are, I'm sorry to say, heavily anonymous
> and immature.
>
> Third, there are two complementary problems. On the one
> hand, you split geneticist participation in a wiki gene
> encyclopedia between WP and CZ; on the other hand, you forgo
> the possibility of a focused and unified effort in the expert
> hands of CZ editors and processes. Considering that most
> wiki initiatives fail, period, this may be the most important
> point of all. I would also be much more apt to spend my own
> time, recruiting geneticists, if the project were exclusively
> a CZ project. I wouldn't take so much interest, frankly, if
> it were competing with WP.
>
> In fact, and this is also important, if the CZ articles were
> left largely untouched and the WP articles experienced some
> development, I would want to delete them from CZ. There's no
> point in having two copies of this same sort of resource if
> they aren't both moving forward.
>
> In short, CZ is the right home for this sort of project, and
> splitting the scientist population and the mindshare is,
> frankly, a non-starter as far as I'm concerned. Obviously,
> though, this is up to you.
>
> There is one other consideration. It might be better to
> begin life on CZ and, if there isn't enough interest, then
> switch to the inferior solution. This is probably the best
> way to maximize the success of your project--more than either
> starting exclusively on WP, or splitting the difference.
>
> > >********************************
> > > It is to be watched whether a pharma company might have any
> > >commercial interest, even one not evident to you, in
> > >influencing the content in any way of an article they are
> > >involved with.
> >
> > A valid point, and we welcome the scrutiny. First, it's
> > worth pointing out that potential biases pertain to hand-made
> > edits as well. The fact that we're talking about a bot to
> > make automated edits changes the number of contributions I'm
> > (indirectly) making, and not the fact that I work for a
> > company. Unless CZ plans on excluding all contributors who
> > work for commercial entities, then I think this comes down to
> > a person-by-person evaluation of credentials when approving
> > authorship and editorship and ongoing evaluation of contributions.
>
> Well, I would emphasize a different point. Insofar as a
> pharma-funded organization is supplying the bot and data, we
> can already see exactly what the information is they're
> supplying, and the external links, etc. We can RIGHT NOW
> make a judgment if there is something unfair going on. The
> question isn't really whether a pharma company is benefitted;
> the whole world would benefit from a kick-ass Citizendium.
> The question is whether the data *unfairly* benefits an
> entity and does so by our information unfairly preferring one
> entity over another.
>
> If you biologists, familiar with the players and resources
> available, assure me this isn't the case based on the example
> provided, I think I'm comfortable with the situation.
>
> > Third, as was pointed out in an email that Larry forwarded,
> > the functions of the bot and the rules by which it operates
> > are completely transparent.
>
> Exactly.
>
> > As I see it, the only
> > potential conflict of interest is the link from the gene
> > stubs to SymAtlas (the free and public gene portal that we
> > created) and the SymAtlas images displayed on the "Gallery"
> > subpage.**
>
> The questions, clearly, are (1) whether there is another free (or
> very-commonly-subscribed-to) resource that is as good or
> better. Anyone know? And (2) whether the (image)
> information is actually useful to geneticists.
>
> > ** it turns out that I actually didn't set up the APP example
> > stub how I'd really like to see it. I intended to put a link
> > directly back to SymAtlas, where additional gene expression
> > data sets are available. Take a look at the WP pages linked
> > above to see basically how I'd propose linking them here
> > ("More reference expression data" link).
> >
> >
> > >********************************
> > > And what is the long-term plan here? And why is the
> > >license an issue?
> >
> > Well, no one asked that first question, but it certainly
> > relates to the second. Eventually I'd like to incorporate
> > gene wiki content directly into SymAtlas (actually SymAtlas'
> > successor, being developed now), including reciprocal links.
> > One link will take CZ/WP users to SymAtlas and its
> > additional gene expression data sets. Similarly, SymAtlas
> > will display the community-contributed wiki content and link
> > back to CZ/WP.
>
> That's yet another reason, by the way, to have only one home
> for these articles, and that it be CZ: here, geneticists can
> act as editors, and someone who uses our data doesn't have to
> negotiate between different CZ and WP versions of articles.
>
> Personally, I don't have any problem with a corporation
> profiting from CZ's information, as long as they--as in this
> case--bring something significant to the table.
>
> > >********************************
> > > And what are the next immediate steps?
> >
> > The next step for CZ will be to test whether the WP bot
> > will work with little/no modifications. There were no
> > objections from the CZ-Tools group, so we hope to do this in
> > the next week or two.
>
> That sounds good.
>
> > The WP bot trial period is done, so we
> > expect to go into mass production mode there later this week.
>
> Again, I think that's a bad idea. I would be forced to
> reconsider my stance.
>
> > Although hiccups aren't unexpected, I hope to have at least
> > a thousand or so automated and semi-automated WP edits done
> > in the next month. Not long after that, I hope to draft a
> > manuscript to submit to an academic journal. If the CZ bot
> > test goes as expected, I think it would be possible to
> > quickly catch up over here (assuming there continues to be
> > support for it here and the licensing issue can be worked
> > out) so that the CZ effort can also be mentioned/highlighted
> > in the manuscript.
>
> The licensing issue can be worked out very quickly, I think.
> So far I've seen no objections from the biologists, and I
> don't know that I would even ask the Editorial Council for
> their opinion on the licensing question, frankly: such legal
> questions ultimately must be decided by
> the legal owners/trustees of the project. Of course, if it
> turned out to be extremely unpopular, my decision might be influenced.
>
> --Larry