TIGSource Forums > Developer > Technical (Moderator: ThemsAllTook) > General thread for quick questions
Author Topic: General thread for quick questions  (Read 135986 times)
Dacke
Level 10
*****



View Profile
« Reply #220 on: September 12, 2015, 07:05:05 PM »

Post the message?
Logged

programming • free software
animal liberation • veganism
anarcho-communism • intersectionality • feminism
ProgramGamer
Administrator
Level 10
******


aka Mireille


View Profile
« Reply #221 on: September 12, 2015, 07:21:46 PM »

Code:
fatal: The current branch master has no upstream branch.
To push the current branch and set the remote as upstream, use

    git push --set-upstream www.WebSiteWhereIWantToUploadMyThings.com master
Here's the message
Logged

Dacke
Level 10
*****



View Profile
« Reply #222 on: September 12, 2015, 08:33:57 PM »

What command did you use? What does git remote say?

Code:
$ git remote -v

edit: My guess is that you've forgotten to connect your local repository (on your computer) to your bitbucket repository. Did you add bitbucket as a remote repository, as per the bitbucket tutorial?
https://confluence.atlassian.com/bitbucket/create-a-repository-221449521.html

Code:
$ git remote add origin ssh://[email protected]/username/bbreponame.git

« Last Edit: September 13, 2015, 05:00:08 AM by Dacke » Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #223 on: September 13, 2015, 07:31:04 AM »

wrote this

Code:
import os
import sys

File = open("C:\Users\user\Documents\#1 ConceptNet Relations\dictionary.txt","r")
line = File.readline()
print(line)

got this

Code:
C:\Python34\python.exe C:/Users/user/PycharmProjects/hellopython/ParseDictionaryToUnique.py
  File "C:/Users/user/PycharmProjects/hellopython/ParseDictionaryToUnique.py", line 4
    File = open("C:\Users\user\Documents\#1 ConceptNet Relations\dictionary.txt","r")
               ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

Process finished with exit code 1

What's wrong??
Logged

Layl
Level 3
***

professional jerkface


View Profile WWW
« Reply #224 on: September 13, 2015, 07:33:24 AM »

What's wrong??

String escaping
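
A minimal sketch of what "string escaping" means here, assuming a shortened, hypothetical path. In an ordinary Python 3 string literal, `\U` starts an eight-hex-digit Unicode escape, so `"C:\Users..."` fails to compile. The three spellings below avoid that, and the `compile` call only reproduces the failure without breaking the script itself.

```python
# Hypothetical, shortened path used only for illustration.
raw = r"C:\Users\user\Documents\dictionary.txt"          # raw string: backslashes kept as-is
doubled = "C:\\Users\\user\\Documents\\dictionary.txt"   # every backslash escaped by hand
forward = "C:/Users/user/Documents/dictionary.txt"       # forward slashes also work on Windows

# The raw and the doubled spelling produce the same string.
same = raw == doubled

# Reproduce the SyntaxError safely: the source text below contains the
# un-prefixed literal "C:\Users\user", and \U is not followed by 8 hex digits.
bad_source = r'path = "C:\Users\user"'
try:
    compile(bad_source, "<demo>", "exec")
    failed = False
except SyntaxError:
    failed = True

print(same, failed)
```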
Logged
indie11
Level 2
**


View Profile
« Reply #225 on: September 13, 2015, 07:45:47 AM »

Has anyone here ever worked on a turn-based multiplayer game in Unity? If so, did you roll your own system or use some third-party API?
Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #226 on: September 13, 2015, 07:52:58 AM »

I solved it by randomly stumbling on an unrelated Stack Overflow answer about another problem. Using "r" as a prefix solves it (it makes the string raw); apparently the \U is the problem. Now I have

Code:
import os
import sys

File = open(r"C:\Users\user\Documents\#1 ConceptNet Relations\dictionary.txt", "r")  # r before "" for raw text
line = File.readline()
print(line)
l = list(File)
for line in File:
    print (line)

which leads to

Code:
Traceback (most recent call last):
  File "C:/Users/user/PycharmProjects/hellopython/ParseDictionaryToUnique.py", line 8, in <module>
    for line in File:
  File "C:\Python34\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 4847: character maps to <undefined>

When I comment out the line

Code:
l = list(File)

the output gets cut off here:

Code:
colombia

2009_banja_luka_challenger

bosnia_and_herzegovina

2009_barcelona_open_banco_sabadell

barcelona

2009_barcelona_open_banco_sabadell

spain

2009_bh_telecom_indoors

Looking at the same place in the file

Quote
genoa
2009_asb_classic
auckland
2009_australian_open
victoria/n/australia
2009_bancolombia_open
colombia
2009_banja_luka_challenger
bosnia_and_herzegovina
2009_barcelona_open_banco_sabadell
barcelona
2009_barcelona_open_banco_sabadell
spain
2009_bh_telecom_indoors
bosnia_and_herzegovina
2009_bh_tennis_open_international_cup
brazil
2009_brazilian_grand_prix
autódromo_josé_carlos_pace
2009_brazilian_grand_prix
são_paulo
2009_british_grand_prix
buckinghamshire

Nothing suspect at all, the next line seems very fine ... Huh?
Logged

Dacke
Level 10
*****



View Profile
« Reply #227 on: September 13, 2015, 08:10:38 AM »

Did you look at invisible characters? Maybe there is a different kind of whitespace/linebreak at that position?

What encoding is the file? Have you tried specifically setting the right encoding?
http://stackoverflow.com/questions/9233027/unicodedecodeerror-charmap-codec-cant-decode-byte-x-in-position-y-character
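
A small sketch of why the encoding matters, assuming (this is a guess, not something the thread confirms) that the file holds UTF-8 text. On Windows, `open()` without an `encoding` argument falls back to the locale default, cp1252 in the traceback above, and cp1252 has no character for byte 0x8d. That byte is a normal continuation byte in UTF-8, for example in "č". The temporary file name is hypothetical.

```python
import os
import tempfile

# "č" (U+010D) is the two bytes C4 8D in UTF-8.
data = "č\n".encode("utf-8")

# Decoding those bytes as cp1252 fails on 0x8d, as in the traceback.
try:
    data.decode("cp1252")
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

# Passing the encoding explicitly to open() avoids the platform default.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")  # hypothetical file
    with open(path, "wb") as f:
        f.write(data)
    with open(path, "r", encoding="utf-8") as f:
        first_line = f.readline()

print(cp1252_failed, repr(first_line))
```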
Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #228 on: September 13, 2015, 08:26:57 AM »

It's from the Blitz parser I wrote earlier, so if there is a strange keycode, it should be on every line! Maybe the original data has some strange characters? Huh?


I'm looking at your link

The blitzcode in question, super straightforward ...

Code:
Include "blitz 3D test parser.bb" ;http://www.blitzbasic.com/codearcs/codearcs.php?code=161

;http://www.zytrax.com/tech/codes.htm
;HT-09-9-Horizontal Tab



; Set The Graphic Mode
Graphics 600,300,0,2





; Open the file to Read
filein = ReadFile("C:\Users\user\Desktop\part_00.csv")

file$ = ""
this = 0

currentFolder$ = CurrentDir() + "#1 ConceptNet Relations"

dictionary$ = "\dictionary.txt"
Dictfile = WriteFile (currentFolder + dictionary)


If FileType(currentFolder) <> 2 Then
Print "no folder found! - trying to create new folder"
CreateDir currentFolder
Print currentFolder
If FileType(currentFolder) <> 2 Then
Print "Creation failed!!":WaitKey()
Else
Print "creation succeed!":WaitKey()
EndIf
EndIf

Print


Print "Lines of text read from file " + filein
Print

rtemp$ = "";for list of different relation
rcount = 0






;MAINLOOP: skim through each line parsing and filtering data
While Not Eof( filein ) Or KeyHit (1)

Read1$ = ReadLine$( filein ) ; read a new line
parse( Read1, Chr(9) ) ; parse the line into chunk

prevcount = count
count = 0



;strip the non desired data from the chunk, discard useless chunk
For back.parsereturn=Each parsereturn ;flip through the chunk

;let's try to rip the member after the assertion
;If count = 1 Then relation$ = back\word ;store the relation
If count = 2 Then arg1$ = back\word ;store arguments 1
If count = 3 Then arg2$ = back\word ;store argument 2

count = count +1
Next



If Instr(arg1,"/c/en/") <> 0 And Instr(arg2,"/c/en/") <> 0 Then
arg1=Replace (arg1,"/c/en/",""):
arg2=Replace (arg2,"/c/en/","")

ParseArg(arg1,dictfile)
ParseArg(arg2,dictfile)
EndIf






;visualization control
If KeyHit (28) Then
WaitKey()
Print "pause"
EndIf
;Print

Wend
;END MAINLOOP





CloseFile (Dictfile)
Print "end " + rcount ;50
CloseFile( filein )
;WaitKey()





;----------------------------------------------------------------



Function ParseArg(arg$, file)
parse (arg, ",")
Local count = 0

For back.parsereturn = Each parsereturn
count = count + 1
WriteLine( file , back\word )
Next

; writeline (file, element)

; argcount$ = count
; Print argcount + " " + arg

End Function
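
For comparison, a rough Python sketch of the same extraction. It assumes that `parse` from the included Blitz file simply splits on the given delimiter, and the sample rows are invented for illustration, not real ConceptNet lines. It takes tab-separated fields 2 and 3, keeps the line only if both are English concepts, and writes each comma-separated piece on its own line.

```python
import io

PREFIX = "/c/en/"  # English concept prefix, as in the Blitz code


def extract_english_args(lines, out):
    """Write the comma-separated pieces of fields 2 and 3 of each English row."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # not enough columns on this row
        arg1, arg2 = fields[2], fields[3]
        if PREFIX in arg1 and PREFIX in arg2:
            for arg in (arg1, arg2):
                for word in arg.replace(PREFIX, "").split(","):
                    out.write(word + "\n")


# Invented sample rows: only the first one has two English arguments.
sample = [
    "/a/x\t/r/IsA\t/c/en/barcelona\t/c/en/spain\n",
    "/a/y\t/r/IsA\t/c/fr/chat\t/c/en/cat\n",
]
out = io.StringIO()
extract_english_args(sample, out)
result = out.getvalue()
print(result)
```

With real files, opening both input and output with an explicit `encoding="utf-8"` would keep the dictionary file in a known encoding.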



EDIT:

Oh it seems it's about the latin capital or something like that

http://www.i18nqa.com/debug/bug-double-conversion.html
http://www.fileformat.info/info/unicode/char/00cd/index.htm

« Last Edit: September 13, 2015, 08:32:08 AM by Jimym GIMBERT » Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #229 on: September 13, 2015, 01:19:28 PM »

Code:
File = open(r"C:\Users\user\Documents\#1 ConceptNet Relations\dictionary.txt", "r", encoding="utf8")


Code:
Traceback (most recent call last):
  File "C:/Users/user/PycharmProjects/hellopython/ParseDictionaryToUnique.py", line 9, in <module>
    print(line)
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u1ed9' in position 11: character maps to <undefined>

Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #230 on: September 13, 2015, 01:35:13 PM »

Okay, that's weird. I opened the file in Notepad++ and converted the whole thing to several encodings, but I still get an error no matter what, generally at a consistent but different break point each time. HALP!
Logged

Cheesegrater
Level 1
*



View Profile
« Reply #231 on: September 13, 2015, 02:04:37 PM »

It's probably not UTF-8. Have you tried latin-1? UTF-16?
Logged
gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #232 on: September 13, 2015, 02:12:52 PM »

I'm trying everything. I haven't found a list of the encoding names Python accepts yet

EDIT: Latin-1 DID IT!
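
A short sketch of why latin-1 "works" no matter what: it assigns a character to every one of the 256 byte values, so decoding can never raise. That does not prove the file really is latin-1. If the bytes are actually UTF-8 (an assumption here), each non-ASCII character silently turns into two wrong ones.

```python
# latin-1 decodes any byte sequence without error.
every_byte = bytes(range(256))
decoded = every_byte.decode("latin-1")

# UTF-8 bytes read as latin-1 give mojibake instead of an exception.
utf8_bytes = "é".encode("utf-8")          # two bytes: C3 A9
mojibake = utf8_bytes.decode("latin-1")   # each byte becomes its own character

# As long as nothing was modified, the round trip undoes the damage.
repaired = mojibake.encode("latin-1").decode("utf-8")

print(len(decoded), mojibake, repaired)
```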
Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #233 on: September 13, 2015, 02:19:16 PM »

OKAY, now I can print with repr(line) but not directly. Better though; I just need to strip the \n
Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #234 on: September 13, 2015, 08:25:35 PM »

Well, it still doesn't work:

The goal was to use a "set" to remove all duplicates, but as soon as I moved from repr to the actual set it crashed because of Unicode .....

Research shows that Unicode is a nightmare in Python (and in general). I don't know what to do, so for now I'm filing it as a failure. The problem is that there are 2 140 288 lines in the folder, and checking them all for duplicates by brute force would take n² time. Turns out I need a crash course in sorting; no way to avoid it now.
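
A sketch of the sorting route, with a made-up word list standing in for the real file. Sorting puts equal lines next to each other, so one pass with `itertools.groupby` drops the duplicates in O(n log n) overall instead of n². A plain `set` does the same job in roughly linear time but loses the order.

```python
from itertools import groupby


def unique_sorted(lines):
    """Return the distinct lines in sorted order (sort, then collapse neighbours)."""
    return [key for key, _group in groupby(sorted(lines))]


# Made-up sample standing in for the dictionary file.
words = ["spain", "barcelona", "spain", "colombia", "barcelona"]
unique_words = unique_sorted(words)
print(unique_words)  # ['barcelona', 'colombia', 'spain']
```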
Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #235 on: September 13, 2015, 09:07:30 PM »

http://textmechanic.com/Big-File-Tool-Remove-Duplicate-Lines.html
The Internet has everything
Logged

Dacke
Level 10
*****



View Profile
« Reply #236 on: September 13, 2015, 10:11:42 PM »

If possible, try to avoid having text files in anything but utf8. Death to latin1.

Using a set for utf8 text works perfectly fine.

I made this file and encoded it in utf8:

Code:
Öñü中華民族日本語
ひらがな平仮名
2009_bh_telecom_indoors
bosnia_and_herzegovina
Öñü中華民族日本語
2009_bh_telecom_indoors

Then I wrote this python3 program:

Code:
# open the file, stating the encoding instead of relying on the platform default
with open("utf8lines.txt", encoding="utf8") as file:
    # strip each line and collect the unique ones in a set
    unique_lines = {line.strip() for line in file}

# print all lines in the set (a set has no guaranteed order)
for line in unique_lines:
    print(line)

Which correctly outputs the unique lines:

Code:
Öñü中華民族日本語
ひらがな平仮名
2009_bh_telecom_indoors
bosnia_and_herzegovina
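
A set does not promise any particular order, so the output order above is not guaranteed. If first-seen order matters, one option (a sketch, using the same sample lines as an in-memory list) is `collections.OrderedDict.fromkeys`, which keeps the first occurrence of each key in order.

```python
from collections import OrderedDict

# The sample file's lines, held in a list for the sake of the example.
lines = [
    "Öñü中華民族日本語\n",
    "ひらがな平仮名\n",
    "2009_bh_telecom_indoors\n",
    "bosnia_and_herzegovina\n",
    "Öñü中華民族日本語\n",
    "2009_bh_telecom_indoors\n",
]

# Keys of an OrderedDict are unique and stay in insertion order.
unique_in_order = list(OrderedDict.fromkeys(line.strip() for line in lines))
print(len(unique_in_order))  # 4
```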
Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #237 on: September 13, 2015, 10:38:44 PM »

To be frank, I'm parsing a text out of another text that I didn't build originally (ConceptNet).

I did fuck up some characters, which means I will have mismatches in the next step.

Basically I have 5 GB of CSV data in 5 files. I extracted all the relevant data from the first file, English concepts only, into separate files where the semantic relation is the name of the file and the arguments are tab-separated on each line. Then I extracted all the arguments into a dictionary file and tried to remove duplicates and order them alphabetically, in the hope of building an index.

The next step would have been to replace the arguments in the relation files with the concept index from the dictionary file, then generalize to all concepts, in all languages, to extract the remaining data from the 5 GB database by automatically detecting other-language and cross-language concepts.

So far it's compromised, at least with my current implementation.

I'll try your stuff.
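
The indexing step described above could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, the index is simply each concept's position in the sorted list of unique words, and the relation line is an invented example.

```python
def build_index(words):
    """Map each distinct word to its position in alphabetical order."""
    return {word: i for i, word in enumerate(sorted(set(words)))}


def encode_relation_line(line, index):
    """Replace each tab-separated argument on a line by its index number."""
    args = line.rstrip("\n").split("\t")
    return "\t".join(str(index[arg]) for arg in args)


# Invented sample data.
dictionary_words = ["spain", "barcelona", "spain", "colombia"]
index = build_index(dictionary_words)
encoded = encode_relation_line("barcelona\tspain\n", index)
print(index, encoded)
```

A dict lookup per argument keeps the replacement pass linear in the number of lines. A character damaged during extraction would show up here as a `KeyError`.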
Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #238 on: September 13, 2015, 11:38:54 PM »

nope
Code:
bar_aqueduct
jujubinus_poppei
yunshan_road_station
yves_duval
brinklow
donggyo-dong
fuentelsaz_de_soria
étienne_boulay
Traceback (most recent call last):
  File "C:/Users/user/PycharmProjects/hellopython/ParseDictionaryToUnique.py", line 24, in <module>
    print(line)
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0144' in position 5: character maps to <undefined>
Logged

Dacke
Level 10
*****



View Profile
« Reply #239 on: September 13, 2015, 11:46:29 PM »

You still have to make sure to get the encoding right. But that's true no matter what programming language you use, python isn't better or worse.

edit: This isn't the issue, Boris gets it right in the next post.
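
One reading of the last traceback: it fails inside `encode` while running `print(line)`, so the file was decoded fine and it is the cp1252 console that cannot represent U+0144. A minimal sketch of that situation, using a hypothetical word and plain `str.encode` to stand in for the console:

```python
import io

text = "ko\u0144"  # hypothetical word containing U+0144

# Strict encoding to cp1252 fails, like print() on a cp1252 console.
try:
    text.encode("cp1252")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Error handlers trade fidelity for not crashing.
escaped = text.encode("cp1252", errors="backslashreplace")
replaced = text.encode("cp1252", errors="replace")

# Writing to a UTF-8 stream (a file opened with encoding="utf-8" behaves
# the same way) keeps the character intact.
buffer = io.BytesIO()
out = io.TextIOWrapper(buffer, encoding="utf-8")
out.write(text)
out.flush()
utf8_bytes = buffer.getvalue()

print(strict_failed, escaped, replaced)
```

Printing `repr(line)` or `ascii(line)` works for a similar reason: non-ASCII characters are escaped before they reach the console.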
« Last Edit: September 13, 2015, 11:53:55 PM by Dacke » Logged
